Insect p53 tumor suppressor genes and proteins

ABSTRACT

A family of p53 tumor suppressor nucleic acids and proteins isolated from several insect species is described. The p53 nucleic acids and proteins can be used to genetically modify metazoan invertebrate organisms, such as insects and worms, or cultured cells, resulting in p53 expression or mis-expression. The genetically modified organisms or cells can be used in screening assays to identify candidate compounds that are potential pesticidal agents or therapeutics that interact with p53 protein. They can also be used in methods for studying p53 activity and identifying other genes that modulate the function of, or interact with, the p53 gene. Nucleic acid and protein sequences for Drosophila p33 and Rb tumor suppressors are also described.

REFERENCE TO RELATED APPLICATION

[0001] This application is a continuation-in-part of U.S. application Ser. No. 09/268,969, filed Mar. 16, 1999; and of U.S. application No. 60/184,373 of same title, filed Feb. 23, 2000. The entire contents of both prior applications are incorporated herein by reference.

BACKGROUND OF THE INVENTION

[0002] The p53 gene is mutated in over 50 different types of human cancers, including familial and spontaneous cancers, and is believed to be the most commonly mutated gene in human cancer (Zambetti and Levine, FASEB (1993) 7:855-865; Hollstein, et al., Nucleic Acids Res. (1994) 22:3551-3555). Greater than 90% of mutations in the p53 gene are missense mutations that alter a single amino acid and thereby inactivate p53 function. Aberrant forms of human p53 are associated with poor prognosis, more aggressive tumors, metastasis, and survival rates of less than 5 years (Koshland, Science (1993) 262:1953).

[0003] The human p53 protein normally functions as a central integrator of signals arising from different forms of cellular stress, including DNA damage, hypoxia, nucleotide deprivation, and oncogene activation (Prives, Cell (1998) 95:5-8). In response to these signals, p53 protein levels are greatly increased with the result that the accumulated p53 activates pathways of cell cycle arrest or apoptosis depending on the nature and strength of these signals. Indeed, multiple lines of experimental evidence have pointed to a key role for p53 as a tumor suppressor (Levine, Cell (1997) 88:323-331). For example, homozygous p53 “knockout” mice are developmentally normal but exhibit nearly 100% incidence of neoplasia in the first year of life (Donehower et al., Nature (1992) 356:215-221). The biochemical mechanisms and pathways through which p53 functions in normal and cancerous cells are not fully understood, but one clearly important aspect of p53 function is its activity as a gene-specific transcriptional activator. Among the genes with known p53-response elements are several with well-characterized roles in either regulation of the cell cycle or apoptosis, including GADD45, p21/Waf1/Cip1, cyclin G, Bax, IGF-BP3, and MDM2 (Levine, Cell (1997) 88:323-331).

[0004] Human p53 is a 393 amino acid phosphoprotein which is divided structurally and functionally into distinct domains joined in the following order from N-terminus to C-terminus of the polypeptide chain: (a) a transcriptional activation domain; (b) a sequence-specific DNA-binding domain; (c) a linker domain; (d) an oligomerization domain; and (e) a basic regulatory domain. Other structural details of the p53 protein are in keeping with its function as a sequence-specific gene activator that responds to a variety of stress signals. For example, the most N-terminal domain of p53 is rich in acidic residues, consistent with structural features of other transcriptional activators (Fields and Jang, Science (1990) 249:1046-49). By contrast, the most C-terminal domain of p53 is rich in basic residues, and has the ability to bind single-stranded DNA, double-stranded DNA ends, and internal deletion loops (Jayaraman and Prives, Cell (1995) 81: 1021-1029). The association of the p53 C-terminal basic regulatory domain with these forms of DNA that are generated during DNA repair may trigger conversion of p53 from a latent to an activated state capable of site-specific DNA binding to target genes (Hupp and Lane, Curr. Biol. (1994) 4: 865-875), thereby providing one mechanism to regulate p53 function in response to DNA damage. Importantly, both the N-terminal activation domain and the C-terminal basic regulatory domain of p53 are subject to numerous covalent modifications which correlate with stress-induced signals (Prives, Cell (1998) 95:5-8). For example, the N-terminal activation domain contains residues that are targets for phosphorylation by the DNA-activated protein kinase, the ATM kinase, and the cyclin activated kinase complex. The C-terminal basic regulatory domain contains residues that are targets for phosphorylation by protein kinase-C, cyclin dependent kinase, and casein kinase II, as well as residues that are targets for acetylation by PCAF and p300 acetyl transferases. p53 activity is also modulated by specific non-covalent protein-protein interactions (Ko and Prives, Genes Dev. (1996) 10: 1054-1072). Most notably, the MDM2 protein binds a short, highly conserved protein sequence motif, residues 13-29, in the N-terminal activation domain of p53 (Kussie et al., Science (1996) 274:948-953). As a result of binding p53, MDM2 both represses p53 transcriptional activity and promotes the degradation of p53.

[0005] Although several mammalian and vertebrate homologs of the tumor suppressor p53 have been described, only two invertebrate homologs have been identified to date, in mollusc and squid. Few lines of evidence, however, have hinted at the existence of a p53 homolog in any other invertebrate species, such as the fruit fly Drosophila. Indeed, numerous direct attempts to isolate a Drosophila p53 gene by either cross-hybridization or PCR have failed to identify a p53-like gene in this species (Soussi et al., Oncogene (1990) 5: 945-952). However, other studies of response to DNA damage in insect cells using nucleic acid cross-hybridization and antibody cross-reactivity have provided suggestive evidence for existence of p53-, p21-, and MDM2-like genes (Bae et al., Exp Cell Res (1995) 375:105-106; Yakes, 1994, Ph.D. thesis, Wayne State University). Nonetheless, no isolated insect p53 genes or proteins have been reported to date.

[0006] Identification of novel p53 orthologues in model organisms such as Drosophila melanogaster and other insect species provides important and useful tools for genetic and molecular study and validation of these molecules as potential pharmaceutical and pesticide targets. The present invention discloses insect p53 genes and proteins from a variety of diverse insect species. In addition, Drosophila homologs of p33 and Rb genes, which are also involved in tumor suppression, are described.

SUMMARY OF THE INVENTION

[0007] It is an object of the present invention to provide insect p53 nucleic acid and protein sequences that can be used in genetic screening methods to characterize pathways that p53 may be involved in as well as other interacting genetic pathways. It is also an object of the invention to provide methods for screening compounds that interact with p53 such as those that may have utility as therapeutics.

[0008] These and other objects are provided by the present invention which concerns the identification and characterization of insect p53 genes and proteins in a variety of insect species. Isolated nucleic acid molecules are provided that comprise nucleic acid sequences encoding p53 polypeptides and derivatives thereof. Vectors and host cells comprising the p53 nucleic acid molecules are also described, as well as metazoan invertebrate organisms (e.g. insects, coelomates and pseudocoelomates) that are genetically modified to express or mis-express a p53 protein.

[0009] An important utility of the insect p53 nucleic acids and proteins is that they can be used in screening assays to identify candidate compounds which are potential therapeutics or pesticides that interact with p53 proteins. Such assays typically comprise contacting a p53 polypeptide with one or more candidate molecules, and detecting any interaction between the candidate compound and the p53 polypeptide. The assays may comprise adding the candidate molecules to cultures of cells genetically engineered to express p53 proteins, or alternatively, administering the candidate compound to a metazoan invertebrate organism genetically engineered to express p53 protein.

[0010] The genetically engineered metazoan invertebrate animals of the invention can also be used in methods for studying p53 activity, or for validating therapeutic or pesticidal strategies based on manipulation of the p53 pathway. These methods typically involve detecting the phenotype caused by the expression or mis-expression of the p53 protein. The methods may additionally comprise observing a second animal that has the same genetic modification as the first animal and, additionally, has a mutation in a gene of interest. Any difference between the phenotypes of the two animals identifies the gene of interest as capable of modifying the function of the gene encoding the p53 protein.

BRIEF DESCRIPTION OF THE FIGURE

[0011] FIGS. 1A-1B show a CLUSTALW alignment of the amino acid sequences of the insect p53 proteins identified from Drosophila, Leptinotarsa, Tribolium, and Heliothis, with p53 sequences previously identified in human, Xenopus, and squid. Identical amino acid residues within the alignment are grouped within solid lines and similar amino acid residues are grouped within dashed lines.

DETAILED DESCRIPTION OF THE INVENTION

[0012] The use of invertebrate model organism genetics and related technologies can greatly facilitate the elucidation of biological pathways (Scangos, Nat. Biotechnol. (1997) 15:1220-1221; Margolis and Duyk, Nature Biotech. (1998) 16:311). Of particular use is the insect model organism, Drosophila melanogaster (hereinafter referred to generally as “Drosophila”). An extensive search for p53 nucleic acid and its encoded protein in Drosophila was conducted in an attempt to identify new and useful tools for probing the function and regulation of the p53 genes, and for use as targets in drug discovery. p53 nucleic acid has also been identified in the following additional insect species: Leptinotarsa decemlineata (Colorado potato beetle, hereinafter referred to as Leptinotarsa), Tribolium castaneum (flour beetle, hereinafter referred to as Tribolium), and Heliothis virescens (tobacco budworm, hereinafter referred to as Heliothis).

[0013] The newly identified insect p53 nucleic acids can be used for the generation of mutant phenotypes in animal models or in living cells that can be used to study regulation of p53, and the use of p53 as a drug or pesticide target. Due to the ability to rapidly carry out large-scale, systematic genetic screens, the use of invertebrate model organisms such as Drosophila has great utility for analyzing the expression and mis-expression of p53 protein. Thus, the invention provides a superior approach for identifying other components involved in the synthesis, activity, and regulation of p53 proteins. Systematic genetic analysis of p53 using invertebrate model organisms can lead to the identification and validation of compound targets directed to components of the p53 pathway. Model organisms or cultured cells that have been genetically engineered to express p53 can be used to screen candidate compounds for their ability to modulate p53 expression or activity, and thus are useful in the identification of new drug targets, therapeutic agents, diagnostics and prognostics useful in the treatment of disorders associated with cell cycle, DNA repair, and apoptosis. The details of the conditions used for the identification and/or isolation of insect p53 nucleic acids and proteins are described in the Examples section below. Various non-limiting embodiments of the invention, applications and uses of the insect p53 genes and proteins are discussed in the following sections. The entire contents of all references, including patent applications, cited herein are incorporated by reference in their entireties for all purposes. Additionally, the citation of a reference in the preceding background section is not an admission of prior art against the claims appended hereto.

[0014] p53 Nucleic Acids

[0015] The following nucleic acid sequences encoding insect p53 are described herein: SEQ ID NO:1, isolated from Drosophila, and referred to herein as DMp53; SEQ ID NO:3, isolated from Leptinotarsa, and referred to herein as CPBp53; SEQ ID NO:5 and SEQ ID NO:7, isolated from Tribolium, and referred to herein as TRIB-Ap53 and TRIB-Bp53, respectively; and SEQ ID NO:9, isolated from Heliothis, and referred to herein as HELIOp53. The genomic sequence of the DMp53 gene is provided in SEQ ID NO:18.

[0016] In addition to the fragments and derivatives of SEQ ID NOs:1, 3, 5, 7, 9, and 18, as described in detail below, the invention includes the reverse complements thereof. Also, the subject nucleic acid sequences, derivatives and fragments thereof may be RNA molecules comprising the nucleotide sequences of SEQ ID NOs:1, 3, 5, 7, 9, and 18 (or derivative or fragment thereof) wherein the base U (uracil) is substituted for the base T (thymine). The DNA and RNA sequences of the invention can be single- or double-stranded. Thus, the term “isolated nucleic acid sequence” or “isolated nucleic acid molecule”, as used herein, includes the reverse complement, RNA equivalent, DNA or RNA single- or double-stranded sequences, and DNA/RNA hybrids of the sequence being described, unless otherwise indicated.
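By way of illustration only, the reverse complement and RNA equivalent of a DNA sequence can be computed as in the following minimal Python sketch. The function names and the six-base example string are hypothetical and are not taken from any SEQ ID NO; the sketch assumes an unambiguous A/C/G/T alphabet.

```python
# Minimal sketch: reverse complement and RNA equivalent of a DNA string.
# Assumes only the bases A, C, G, T (upper or lower case); the example
# sequence below is made up and is not part of any disclosed sequence.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse the strand."""
    return dna.translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Substitute U (uracil) for T (thymine)."""
    return dna.replace("T", "U").replace("t", "u")


example = "ATGGCC"
print(reverse_complement(example))  # GGCCAT
print(rna_equivalent(example))      # AUGGCC
```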

[0017] Fragments of the p53 nucleic acid sequences can be used for a variety of purposes. Interfering RNA (RNAi) fragments, particularly double-stranded (ds) RNAi, can be used to generate loss-of-function phenotypes. p53 nucleic acid fragments are also useful as nucleic acid hybridization probes and replication/amplification primers. Certain “antisense” fragments, i.e. that are reverse complements of portions of the coding sequence of any of SEQ ID NO:1, 3, 5, 7, 9, or 18, have utility in inhibiting the function of p53 proteins. The fragments are of length sufficient to specifically hybridize with the corresponding SEQ ID NO:1, 3, 5, 7, 9, or 18. The fragments consist of or comprise at least 12, preferably at least 24, more preferably at least 36, and most preferably at least 96 contiguous nucleotides of any one of SEQ ID NOs:1, 3, 5, 7, 9, and 18. When the fragments are flanked by other nucleic acid sequences, the total length of the combined nucleic acid sequence is less than 15 kb, preferably less than 10 kb or less than 5 kb, more preferably less than 2 kb, and in some cases, preferably less than 500 bases. Preferred p53 nucleic acid fragments comprise regulatory elements that may reside in the 5′ UTR and/or encode one or more of the following domains: an activation domain, a DNA binding domain, a linker domain, an oligomerization domain, and a basic regulatory domain. The approximate locations of these regions in SEQ ID NOs 1, 3, and 5, and in the corresponding amino acid sequences of SEQ ID NOs 2, 4, and 6, are provided in Table 1.

TABLE 1

SEQ ID NOs                1/2             3/4             5/6
Insect Genus              Drosophila      Leptinotarsa    Tribolium
5′ UTR                    na 1-111        na 1-120        na 1-93
Activation Domain         na 112-257      na 121-300      na 94-277
                          aa 1-48         aa 1-60         aa 1-60
DNA Binding Domain        na 366-954      na 321-936      na 280-892
                          aa 85-280       aa 67-271       aa 62-265
Linker Domain             na 999-1056     na 937-999      na 893-958
                          aa 296-314      aa 272-292      aa 266-287
Oligomerization Domain    na 1065-1170    na 1000-1113    na 959-1075
                          aa 318-352      aa 293-330      aa 288-326
Basic Regulatory Domain   na 1179-1269    na 1114-1182    na 1076-1147
                          aa 356-385      aa 331-353      aa 327-350

[0018] Further preferred are fragments of bases 354-495 of SEQ ID NO:7 and bases 315-414 of SEQ ID NO:9 of at least 12, preferably at least 24, more preferably at least 36, and most preferably at least 96 contiguous nucleotides.

[0019] The subject nucleic acid sequences may consist solely of any one of SEQ ID NOs:1, 3, 5, 7, 9, or 18, or fragments thereof. Alternatively, the subject nucleic acid sequences and fragments thereof may be joined to other components such as labels, peptides, agents that facilitate transport across cell membranes, hybridization-triggered cleavage agents or intercalating agents. The subject nucleic acid sequences and fragments thereof may also be joined to other nucleic acid sequences (i.e. they may comprise part of larger sequences) and are of synthetic/non-natural sequences and/or are isolated and/or are purified, i.e. unaccompanied by at least some of the material with which it is associated in its natural state. Preferably, the isolated nucleic acids constitute at least about 0.5%, and more preferably at least about 5% by weight of the total nucleic acid present in a given fraction, and are preferably recombinant, meaning that they comprise a non-natural sequence or a natural sequence joined to nucleotide(s) other than that which it is joined to on a natural chromosome.

[0020] Derivative nucleic acid sequences of p53 include sequences that hybridize to the nucleic acid sequence of SEQ ID NOs:1, 3, 5, 7, 9, or 18 under stringency conditions such that the hybridizing derivative nucleic acid is related to the subject nucleic acid by a certain degree of sequence identity. A nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule. Stringency of hybridization refers to conditions under which nucleic acids are hybridizable. The degree of stringency can be controlled by temperature, ionic strength, pH, and the presence of denaturing agents such as formamide during hybridization and washing. As used herein, “stringent hybridization conditions” are those normally used by one of skill in the art to establish at least about a 90% sequence identity between complementary pieces of DNA or DNA and RNA. “Moderately stringent hybridization conditions” are used to find derivatives having at least about a 70% sequence identity. Finally, “low-stringency hybridization conditions” are used to isolate derivative nucleic acid molecules that share at least about 50% sequence identity with the subject nucleic acid sequence.

[0021] The ultimate hybridization stringency reflects both the actual hybridization conditions as well as the washing conditions following the hybridization, and it is well known in the art how to vary the conditions to obtain the desired result. Conditions routinely used are set out in readily available procedure texts (e.g., Current Protocols in Molecular Biology, Vol. 1, Chap. 2.10, John Wiley & Sons, Publishers (1994); Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). A preferred derivative nucleic acid is capable of hybridizing to any one of SEQ ID NOs:1, 3, 5, 7, 9, or 18 under stringent hybridization conditions that comprise: prehybridization of filters containing nucleic acid for 8 hours to overnight at 65° C. in a solution comprising 6× saline sodium citrate (SSC) (1×SSC is 0.15 M NaCl, 0.015 M Na citrate; pH 7.0), 5×Denhardt's solution, 0.05% sodium pyrophosphate and 100 μg/ml herring sperm DNA; hybridization for 18-20 hours at 65° C. in a solution containing 6×SSC, 1×Denhardt's solution, 100 μg/ml yeast tRNA and 0.05% sodium pyrophosphate; and washing of filters at 65° C. for 1 h in a solution containing 0.2×SSC and 0.1% SDS (sodium dodecyl sulfate).
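As an arithmetic aid only, the salt concentrations implied by a given SSC strength follow directly from the 1×SSC definition stated above (0.15 M NaCl, 0.015 M Na citrate). The following Python sketch scales those values; the function and constant names are hypothetical.

```python
# Minimal sketch: molar concentrations for an n-fold SSC solution,
# scaled from the 1x definition given in the text
# (0.15 M NaCl, 0.015 M Na citrate). Names are illustrative only.

SSC_1X_MOLAR = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc_molarity(fold: float) -> dict:
    """Return the molarity of each SSC component at the given strength."""
    return {name: round(molar * fold, 6) for name, molar in SSC_1X_MOLAR.items()}


print(ssc_molarity(6))    # hybridization buffer: {'NaCl': 0.9, 'Na citrate': 0.09}
print(ssc_molarity(0.2))  # stringent wash: {'NaCl': 0.03, 'Na citrate': 0.003}
```

The lower salt concentration of the 0.2×SSC wash, relative to the 6×SSC hybridization buffer, is what makes the wash step the more stringent of the two at the same temperature.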

[0022] Derivative nucleic acid sequences that have at least about 70% sequence identity with any one of SEQ ID NOs:1, 3, 5, 7, 9, and 18 are capable of hybridizing to any one of SEQ ID NO:1, 3, 5, 7, 9, and 18 under moderately stringent conditions that comprise: pretreatment of filters containing nucleic acid for 6 h at 40° C. in a solution containing 35% formamide, 5×SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 μg/ml denatured salmon sperm DNA; hybridization for 18-20 h at 40° C. in a solution containing 35% formamide, 5×SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 μg/ml salmon sperm DNA, and 10% (wt/vol) dextran sulfate; followed by washing twice for 1 hour at 55° C. in a solution containing 2×SSC and 0.1% SDS.

[0023] Other preferred derivative nucleic acid sequences are capable of hybridizing to any one of SEQ ID NOs:1, 3, 5, 7, 9, and 18 under low stringency conditions that comprise: incubation for 8 hours to overnight at 37° C. in a solution comprising 20% formamide, 5×SSC, 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured sheared salmon sperm DNA; hybridization in the same buffer for 18 to 20 hours; and washing of filters in 1×SSC at about 37° C. for 1 hour.

[0024] As used herein, “percent (%) nucleic acid sequence identity” with respect to a subject sequence, or a specified portion of a subject sequence, is defined as the percentage of nucleotides in the candidate derivative nucleic acid sequence identical with the nucleotides in the subject sequence (or specified portion thereof), after aligning the sequences and introducing gaps, if necessary to achieve the maximum percent sequence identity, as generated by the program WU-BLAST-2.0a19 (Altschul et al., J. Mol. Biol. (1990) 215:403-410; http://blast.wustl.edu/blast/README.html; hereinafter referred to generally as “BLAST”) with all the search parameters set to default values. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. A percent (%) nucleic acid sequence identity value is determined by the number of matching identical nucleotides divided by the sequence length for which the percent identity is being reported.
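For illustration, the final arithmetic step of the definition above (matching identical nucleotides divided by the sequence length for which identity is reported) can be sketched in Python as follows. This sketch does not perform the alignment and is not a substitute for BLAST: it assumes the two sequences have already been aligned, with gaps written as "-", and the function name, parameters, and example strings are hypothetical.

```python
# Minimal sketch: percent identity from two pre-aligned sequences.
# Assumes the alignment was produced elsewhere (gaps shown as "-");
# the default reporting length is the ungapped subject length.
from typing import Optional


def percent_identity(aligned_subject: str, aligned_candidate: str,
                     reported_length: Optional[int] = None) -> float:
    if len(aligned_subject) != len(aligned_candidate):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both rows carry the same nucleotide (not a gap).
    matches = sum(
        1
        for s, c in zip(aligned_subject.upper(), aligned_candidate.upper())
        if s == c and s != "-"
    )
    if reported_length is None:
        reported_length = len(aligned_subject.replace("-", ""))
    return 100.0 * matches / reported_length


print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5 (one mismatch in 8)
print(percent_identity("ACGTACGT", "ACG-ACGT"))  # 87.5 (one gap in 8)
```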

[0025] Derivative p53 nucleic acid sequences usually have at least 50% sequence identity, preferably at least 60%, 70%, or 80% sequence identity, more preferably at least 85% sequence identity, still more preferably at least 90% sequence identity, and most preferably at least 95% sequence identity with any one of SEQ ID NOs:1, 3, 5, 7, 9, or 18, or domain-encoding regions thereof.

[0026] In one preferred embodiment, the derivative nucleic acid encodes a polypeptide comprising a p53 amino acid sequence of any one of SEQ ID NOs:2, 4, 6, 8, or 10, or a fragment or derivative thereof as described further below under the subheading “p53 proteins”. A derivative p53 nucleic acid sequence, or fragment thereof, may comprise 100% sequence identity with any one of SEQ ID NOs:1, 3, 5, 7, 9, or 18, but be a derivative thereof in the sense that it has one or more modifications at the base or sugar moiety, or phosphate backbone. Examples of modifications are well known in the art (Bailey, Ullmann's Encyclopedia of Industrial Chemistry (1998), 6th ed. Wiley and Sons). Such derivatives may be used to provide modified stability or any other desired property.

[0027] Another type of derivative of the subject nucleic acid sequences includes corresponding humanized sequences. A humanized nucleic acid sequence is one in which one or more codons has been substituted with a codon that is more commonly used in human genes. Preferably, a sufficient number of codons have been substituted such that a higher level expression is achieved in mammalian cells than what would otherwise be achieved without the substitutions. The following list shows, for each amino acid, the calculated codon frequency (number in parentheses) in human genes for 1000 codons (Wada et al., Nucleic Acids Research (1990) 18(Suppl.):2367-2411):

Human codon frequency per 1000 codons:
ARG: CGA (5.4), CGC (11.3), CGG (10.4), CGU (4.7), AGA (9.9), AGG (11.1)
LEU: CUA (6.2), CUC (19.9), CUG (42.5), CUU (10.7), UUA (5.3), UUG (11.0)
SER: UCA (9.3), UCC (17.7), UCG (4.2), UCU (13.2), AGC (18.7), AGU (9.4)
THR: ACA (14.4), ACC (23.0), ACG (6.7), ACU (12.7)
PRO: CCA (14.6), CCC (20.0), CCG (6.6), CCU (15.5)
ALA: GCA (14.0), GCC (29.1), GCG (7.2), GCU (19.6)
GLY: GGA (17.1), GGC (25.4), GGG (17.3), GGU (11.2)
VAL: GUA (5.9), GUC (16.3), GUG (30.9), GUU (10.4)
LYS: AAA (22.2), AAG (34.9)
ASN: AAC (22.6), AAU (16.6)
GLN: CAA (11.1), CAG (33.6)
HIS: CAC (14.2), CAU (9.3)
GLU: GAA (26.8), GAG (41.4)
ASP: GAC (29.0), GAU (21.7)
TYR: UAC (18.8), UAU (12.5)
CYS: UGC (14.5), UGU (9.9)
PHE: UUU (22.6), UUC (15.8)
ILE: AUA (5.8), AUC (24.3), AUU (14.9)
MET: AUG (22.3)
TRP: UGG (13.8)
TER: UAA (0.7), UAG (0.5), UGA (1.2)

[0028] Thus, a p53 nucleic acid sequence in which the glutamic acid codon, GAA, has been replaced with the codon GAG, which is more commonly used in human genes, is an example of a humanized p53 nucleic acid sequence. A detailed discussion of the humanization of nucleic acid sequences is provided in U.S. Pat. No. 5,874,304 to Zolotukhin et al. Similarly, other nucleic acid derivatives can be generated with codon usage optimized for expression in other organisms, such as yeasts, bacteria, and plants, where it is desired to engineer the expression of p53 proteins by using specific codons chosen according to the preferred codons used in highly expressed genes in each organism. More specific embodiments of preferred p53 proteins, fragments, and derivatives are discussed further below under the subheading “p53 proteins”.
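The GAA-to-GAG substitution described above can be sketched, purely as an illustration, with the following Python code. It is not the method of U.S. Pat. No. 5,874,304; it simply replaces each codon with the most frequent synonymous codon from a small subset (GLU, LYS, GLN) of the frequency list given above. The function name, the choice of subset, and the input sequence are hypothetical, and codons outside the subset are left unchanged.

```python
# Minimal sketch: swap each codon for the most frequent human synonym.
# Only three amino acids from the frequency list in the text are included
# here; all other codons pass through unchanged. Illustrative only.

HUMAN_CODON_FREQ = {
    "GLU": {"GAA": 26.8, "GAG": 41.4},
    "LYS": {"AAA": 22.2, "AAG": 34.9},
    "GLN": {"CAA": 11.1, "CAG": 33.6},
}

# Map every listed codon to the most frequent codon for its amino acid.
PREFERRED = {}
for freqs in HUMAN_CODON_FREQ.values():
    best = max(freqs, key=freqs.get)
    for codon in freqs:
        PREFERRED[codon] = best


def humanize(coding_dna: str) -> str:
    """Return the coding sequence with listed codons replaced in frame."""
    rna = coding_dna.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return "".join(PREFERRED.get(c, c) for c in codons).replace("U", "T")


# Hypothetical input encoding Met-Glu-Lys-Gln (not from any SEQ ID NO).
print(humanize("ATGGAAAAACAA"))  # ATGGAGAAGCAG
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged.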

[0029] Nucleic acid encoding the amino acid sequence of any of SEQ ID NOs:2, 4, 6, 8, and 10, or fragment or derivative thereof, may be obtained from an appropriate cDNA library prepared from any eukaryotic species that encodes p53 proteins such as vertebrates, preferably mammalian (e.g. primate, porcine, bovine, feline, equine, and canine species, etc.) and invertebrates, such as arthropods, particularly insect species (preferably Drosophila, Tribolium, Leptinotarsa, and Heliothis), acarids, crustacea, molluscs, nematodes, and other worms. An expression library can be constructed using known methods. For example, mRNA can be isolated to make cDNA which is ligated into a suitable expression vector for expression in a host cell into which it is introduced. Various screening assays can then be used to select for the gene or gene product (e.g. oligonucleotides of at least about 20 to 80 bases designed to identify the gene of interest, or labeled antibodies that specifically bind to the gene product). The gene and/or gene product can then be recovered from the host cell using known techniques.

[0030] Polymerase chain reaction (PCR) can also be used to isolate nucleic acids of the p53 genes where oligonucleotide primers representing fragmentary sequences of interest amplify RNA or DNA sequences from a source such as a genomic or cDNA library (as described by Sambrook et al., supra). Additionally, degenerate primers for amplifying homologs from any species of interest may be used. Once a PCR product of appropriate size and sequence is obtained, it may be cloned and sequenced by standard techniques, and utilized as a probe to isolate a complete cDNA or genomic clone.

[0031] Fragmentary sequences of p53 nucleic acids and derivatives may be synthesized by known methods. For example, oligonucleotides may be synthesized using an automated DNA synthesizer available from commercial suppliers (e.g. Biosearch, Novato, Calif.; Perkin-Elmer Applied Biosystems, Foster City, Calif.). Antisense RNA sequences can be produced intracellularly by transcription from an exogenous sequence, e.g. from vectors that contain antisense p53 nucleic acid sequences. Newly generated sequences may be identified and isolated using standard methods.

[0032] An isolated p53 nucleic acid sequence can be inserted into any appropriate cloning vector, for example bacteriophages such as lambda derivatives, or plasmids such as pBR322, pUC plasmid derivatives and the Bluescript vector (Stratagene, San Diego, Calif.). Recombinant molecules can be introduced into host cells via transformation, transfection, infection, electroporation, etc., or into a transgenic animal such as a fly. The transformed cells can be cultured to generate large quantities of the p53 nucleic acid. Suitable methods for isolating and producing the subject nucleic acid sequences are well-known in the art (Sambrook et al., supra; DNA Cloning: A Practical Approach, Vol. 1, 2, 3, 4, (1995) Glover, ed., IRL Press, Ltd., Oxford, U.K.).

[0033] The nucleotide sequence encoding a p53 protein or fragment or derivative thereof, can be inserted into any appropriate expression vector for the transcription and translation of the inserted protein-coding sequence. Alternatively, the necessary transcriptional and translational signals can be supplied by the native p53 gene and/or its flanking regions. A variety of host-vector systems may be utilized to express the protein-coding sequence such as mammalian cell systems infected with virus (e.g. vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g. baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage DNA, plasmid DNA, or cosmid DNA. If expression in plants is desired, a variety of transformation constructs, vectors and methods are known in the art (see U.S. Pat. No. 6,002,068 for review). Expression of a p53 protein may be controlled by a suitable promoter/enhancer element. In addition, a host cell strain may be selected which modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired.

[0034] To detect expression of the p53 gene product, the expression vector can comprise a promoter operably linked to a p53 gene nucleic acid, one or more origins of replication, and one or more selectable markers (e.g. thymidine kinase activity, resistance to antibiotics, etc.). Alternatively, recombinant expression vectors can be identified by assaying for the expression of the p53 gene product based on the physical or functional properties of the p53 protein in in vitro assay systems (e.g. immunoassays or cell cycle assays). The p53 protein, fragment, or derivative may be optionally expressed as a fusion, or chimeric protein product as described above.

[0035] Once a recombinant that expresses the p53 gene sequence is identified, the gene product can be isolated and purified using standard methods (e.g. ion exchange, affinity, and gel exclusion chromatography; centrifugation; differential solubility; electrophoresis). The amino acid sequence of the protein can be deduced from the nucleotide sequence of the chimeric gene contained in the recombinant and can thus be synthesized by standard chemical methods (Hunkapiller et al., Nature (1984) 310:105-111). Alternatively, native p53 proteins can be purified from natural sources, by standard methods (e.g. immunoaffinity purification).

[0036] p33 and Rb Nucleic Acids

[0037] The invention also provides nucleic acid sequences for Drosophila p33 (SEQ ID NO:19), and Rb (SEQ ID NO:21) tumor suppressors. Derivatives and fragments of these sequences can be prepared as described above for the p53 sequences. Preferred fragments and derivatives comprise the same number of contiguous nucleotides or same degrees of percent identity as described above for p53 nucleic acid sequences. The disclosure below regarding various uses of p53 tumor suppressor nucleic acids and proteins (e.g. transgenic animals, tumor suppressor assays, etc.) also applies to the p33 and Rb tumor suppressor sequences disclosed herein.

[0038] p53 Proteins

[0039] The CLUSTALW program (Thompson, et al., Nucleic Acids Research (1994) 22(22):4673-4680) was used to align the insect p53 proteins described herein with p53 proteins from human (Zakut-Houri et al., EMBO J. (1985) 4:1251-1255; GenBank gi:129369), Xenopus (Soussi et al., Oncogene (1987) 1:71-78; GenBank gi:129374), and squid (GenBank gi:1244762). The alignment generated is shown in FIG. 1 and reveals a number of features in the insect p53 proteins that are characteristic of the previously-identified p53 proteins. With respect to general areas of structural similarity, the DMp53, CPBp53, and TRIB-Ap53 proteins can be roughly divided into three regions: a central region which exhibits a high degree of sequence homology with other known p53 family proteins and which roughly corresponds to the DNA binding domain of this protein family (Cho et al., Science (1994) 265:346-355), and flanking N-terminal and C-terminal regions which exhibit significantly less homology but which correspond in overall size to other p53 family proteins. The fragmentary polypeptide sequences encoded by the TRIB-Bp53 and HELIOp53 cDNAs are shown by the multiple sequence alignment to be derived from the central region, the conserved DNA-binding domain. Significantly, the protein sequence alignment allowed the assignment of the domains in the DMp53, CPBp53, and TRIB-Ap53 proteins listed in Table 1 above, based on sequence homology with previously characterized domains of human p53 (Soussi and May, J. Mol Biol (1996) 260:623-637; Levine, supra; Prives, Cell (1998) 95:5-8).

[0040] Importantly, the most conserved central regions of the DMp53, CPBp53, and TRIB-Ap53 proteins correspond almost precisely to the known functional boundaries of the DNA binding domain of human p53, indicating that these proteins are likely to exhibit similar DNA binding properties to those of human p53. A detailed examination of the conserved residues in this domain further emphasizes the likely structural and functional similarities between human p53 and the insect p53 proteins. First, residues of the human p53 known to be involved in direct DNA contacts (K120, S241, R248, R273, C277, and R280) correspond to identical or similar residues in the DMp53 protein (K113, S230, R234, K259, C263, and R266), and identical residues in the CPBp53 protein (K92, S216, R224, R249, C253, and R256), and the TRIB-Ap53 protein (K88, S213, R220, R245, C249, and R252). Also, with regard to the overall folding of this domain, it was notable that four key residues that coordinate the zinc ligand in the DNA binding domain of human p53 (C176, H179, C238, and C242) are precisely conserved in the DMp53 protein (C156, H159, C227, and C231), the CPBp53 protein (C147, H150, C213, and C217), and the TRIB-Ap53 protein (C144, H147, C210, and C214). Furthermore, it was striking that the mutational hot spots in human p53 most frequently altered in cancer (R175, G245, R248, R249, R273, and R282) are either identical or conserved amino acid residues in the corresponding positions of the DMp53 protein (R155, G233, R234, K235, K259, and R268), the CPBp53 protein (R146, G221, R224, R225, R249, and K258), and the TRIB-Ap53 protein (R143, G217, R220, R221, R245, and K254).

[0041] Interestingly, the insect p53s also have distinct differences from the human, Xenopus, and squid p53s. Specifically, insect p53s contain a unique amino acid sequence within the DNA recognition domain that has the following sequence: (R or K)(I or V)C(S or T)CPKRD. In particular, amino acid residues 259 to 267 of DMp53 have the sequence: KICTCPKRD; residues 249 to 257 of CPBp53 have the sequence: RICSCPKRD; and residues 245 to 253 of TRIB-Ap53 have the sequence: RVCSCPKRD. This is in distinct contrast to the human, Xenopus, and squid p53s, which have the following corresponding sequence: R(I or V)CACPGRD.

[0042] Another region of insect p53s that distinctly differs from previously identified p53s lies in the zinc coordination region of the DNA binding domain. The following sequence is conserved within the insect p53s: FXC(K or Q)NSC (where X=any amino acid). Specifically, residues 225-231 of DMp53 have the sequence: FVCQNSC; residues 211-217 of CPBp53 and residues 208-214 of TRIB-Ap53 have the sequence FVCKNSC; and the corresponding residues in HELIOp53, as shown in FIG. 1, have the sequence: FSCKNSC. In contrast, the corresponding sequence in human and Xenopus p53 is YMCNSSC, and in squid it is FMCLGSC.
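
The two insect-specific signatures described in the preceding paragraphs can be written as simple patterns and checked against the peptides quoted in the text. The following Python sketch is illustrative only: the pattern names and the helper function are assumptions introduced here, and the check is a plain string match rather than a substitute for the CLUSTALW alignment of FIG. 1.

```python
import re

# Pattern names are hypothetical; the motifs themselves are quoted from the text.
# DNA recognition motif: (R or K)(I or V)C(S or T)CPKRD
INSECT_DNA_RECOGNITION = re.compile(r"[RK][IV]C[ST]CPKRD")
# Zinc coordination motif: FXC(K or Q)NSC, where X is any amino acid
INSECT_ZINC_REGION = re.compile(r"F[A-Z]C[KQ]NSC")


def insect_signatures(peptide: str) -> dict:
    """Report which of the two insect-specific motifs occur in a peptide string."""
    return {
        "dna_recognition": INSECT_DNA_RECOGNITION.search(peptide) is not None,
        "zinc_region": INSECT_ZINC_REGION.search(peptide) is not None,
    }


# Peptides quoted in the text for insect and non-insect p53 proteins.
for label, peptide in [
    ("DMp53 259-267", "KICTCPKRD"),
    ("CPBp53 249-257", "RICSCPKRD"),
    ("human/Xenopus/squid form", "RICACPGRD"),
    ("DMp53 225-231", "FVCQNSC"),
    ("human/Xenopus", "YMCNSSC"),
]:
    print(label, insect_signatures(peptide))
```

A pattern match of this kind only flags candidate regions; assignment of a sequence to the insect p53 family would still rest on the full alignment.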

[0043] The high degree of structural homology in the presumptive DNA binding domain of the insect p53 proteins has important implications for engineering derivative (e.g. mutant) forms of these p53 genes for tests of function in vitro and in vivo, and for genetic dissection or manipulation of the p53 pathway in transgenic insects or insect cell lines. Dominant negative forms of human p53 have been generated by creating altered proteins which have a defective DNA binding domain, but which retain a functional oligomerization domain (Brachmann et al., Proc Natl Acad Sci USA (1996) 93:4091-4095). Such dominant negative mutant forms are extremely useful for determining the effects of loss-of-function of p53 in assays of interest. Thus, mutations in highly conserved positions within the DNA binding domain of the insect p53 proteins, which correspond to residues known to be important for the structure and function of human p53 (such as R175H, H179N, and R280T of human p53), are likely to result in dominant negative forms of insect p53 proteins. For example, specific mutations in the DMp53 protein to create dominant negative mutant forms of the protein include R155H, H159N, and R266T, and for the TRIB-Ap53 protein include R143H, H147N, and R252T.
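
The substitution notation used above (wild-type residue, 1-based position, mutant residue) can be applied to a protein sequence programmatically. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, and the sequence is a placeholder built only so that positions 155, 159, and 266 carry the residues named in the text; it is not the sequence of SEQ ID NO:2.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a point substitution written as <wild type><position><mutant>, e.g. 'R155H'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation notation: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1  # residue numbering in the text is 1-based
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# Placeholder sequence (NOT the real DMp53 sequence): alanine everywhere except
# the three positions that the text names for DMp53.
residues = ["A"] * 300
for position, residue in [(155, "R"), (159, "H"), (266, "R")]:
    residues[position - 1] = residue
toy_dmp53 = "".join(residues)

mutants = {m: apply_substitution(toy_dmp53, m) for m in ("R155H", "H159N", "R266T")}
```

Checking the wild-type residue before substituting guards against numbering errors, which are easy to make when positions are transferred between aligned orthologs.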

[0044] Although other domains of the insect p53 proteins, aside from the DNA binding domain, exhibit significantly less homology compared to the known p53 family proteins, the sequence alignment provides important information about their structure and potential function. Notably, just as in the human p53 protein, the C-terminal 20-25 amino acids of the protein comprise a putative region that extends beyond the oligomerization domain, suggesting an analogous function for this region of the insect p53 proteins in regulating activity of the protein. Since deletion of the C-terminal regulatory domain in human p53 has been shown to generate constitutively activated forms of the protein (Hupp and Lane, Curr. Biol. (1994) 4:865-875), it is expected that removal of most or all of the corresponding regulatory domain from the insect p53 proteins will generate an activated protein form. Thus preferred truncated forms of the insect p53 proteins lack at least 10 C-terminal amino acids, more preferably at least 15 amino acids, and most preferably at least 20 C-terminal amino acids. For example, a preferred truncated version of DMp53 comprises amino acid residues 1-376, more preferably residues 1-371, and most preferably residues 1-366 of SEQ ID NO:2. Such constitutively activated mutant forms of the protein are very useful for tests of protein function using in vivo and in vitro assays, as well as for genetic analysis.
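
The residue ranges given for the truncated DMp53 forms follow from simple arithmetic on the protein length. The sketch below assumes a full length of 386 residues, a value inferred here from the stated ranges (376 + 10, 371 + 15, 366 + 20) rather than quoted from the sequence listing, and uses a placeholder string in place of SEQ ID NO:2. The function name is hypothetical.

```python
def truncate_c_terminus(sequence: str, n_removed: int) -> str:
    """Return the sequence lacking its last n_removed C-terminal residues."""
    if not 0 <= n_removed < len(sequence):
        raise ValueError("n_removed must be non-negative and shorter than the sequence")
    return sequence[: len(sequence) - n_removed]


# Assumed full length, inferred from the ranges in the text; placeholder residues.
ASSUMED_DMP53_LENGTH = 386
placeholder = "X" * ASSUMED_DMP53_LENGTH

# Removing 10, 15, and 20 residues reproduces the ranges 1-376, 1-371, and 1-366.
retained_lengths = [len(truncate_c_terminus(placeholder, n)) for n in (10, 15, 20)]
print(retained_lengths)
```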

[0045] The oligomerization domain of the insect p53 proteins exhibits very limited skeletal sequence homology with other p53 family proteins, although the length of this region is similar to that of other p53 family proteins. The extent of sequence divergence in this region of the insect proteins raises the possibility that the insect p53 protein may be unable to form hetero-oligomers with p53 proteins from vertebrates or squid. Although the linker domain located between the DNA binding and oligomerization domains also exhibits relatively little sequence conservation, this region of any of the DMp53, CPBp53, and TRIB-Ap53 proteins contains predicted nuclear localization signals similar to those identified in human p53 (Shaulsky et al., Mol Cell Biol (1990) 10:6565-6577).

[0046] The activation domain at the N-terminus of the insect p53 proteins also exhibits little sequence identity with other p53 family proteins, although the size of this region is roughly the same as that of human p53. Nonetheless, an important feature of this domain is the relative concentration of acidic residues in the insect p53 proteins. Consequently, it is likely that this N-terminal domain of any of the DMp53, CPBp53, and TRIB-Ap53 proteins will exert a transcriptional activation function similar to that of the corresponding human p53 domain (Thut et al., Science (1995) 267:100-104). Interestingly, the DMp53, CPBp53 and TRIB-Ap53 proteins do not appear to possess a highly conserved sequence motif, FxxLWxxL, found at the N-terminus of vertebrate and squid p53 family proteins. In human p53, the conserved residues in this motif participate in a specific interaction between the p53 protein and mdm2 (Kussie et al., Science (1996) 274:948-953).

[0047] It is important to note that, although there is no sequence similarity between the insect p53s and other p53 family members in the C- and N-termini, these regions of p53 contain secondary structure characteristic of p53-related proteins. For example, the human p53 binds DNA as a homo-tetramer and self-association is mediated by a β-sheet and amphipathic α-helix located in the C-terminus of the protein. A similar β-sheet-turn-α-helix is predicted in the C-terminus of DMp53. Further, the N-terminus of the human p53 is a region that includes a transactivation domain and residues critical for binding to the mdm-2 protein. The N-terminus of the DMp53 also includes acidic amino acids and likely functions as a transactivation domain.

[0048] p53 proteins of the invention comprise or consist of an amino acid sequence of any one of SEQ ID NOs:2, 4, 6, 8, and 10 or fragments or derivatives thereof. Compositions comprising these proteins may consist essentially of the p53 protein, fragments, or derivatives, or may comprise additional components (e.g. pharmaceutically acceptable carriers or excipients, culture media, etc.). p53 protein derivatives typically share a certain degree of sequence identity or sequence similarity with any one of SEQ ID NOs:2, 4, 6, 8, and 10 or fragments thereof. As used herein, “percent (%) amino acid sequence identity” with respect to a subject sequence, or a specified portion of a subject sequence, is defined as the percentage of amino acids in the candidate derivative amino acid sequence identical with the amino acid in the subject sequence (or specified portion thereof), after aligning the sequences and introducing gaps, if necessary to achieve the maximum percent sequence identity, as generated by BLAST (Altschul et al., supra) using the same parameters discussed above for derivative nucleic acid sequences. A % amino acid sequence identity value is determined by the number of matching identical amino acids divided by the sequence length for which the percent identity is being reported. “Percent (%) amino acid sequence similarity” is determined by doing the same calculation as for determining % amino acid sequence identity, but including conservative amino acid substitutions in addition to identical amino acids in the computation. A conservative amino acid substitution is one in which an amino acid is substituted for another amino acid having similar properties such that the folding or activity of the protein is not significantly affected. Aromatic amino acids that can be substituted for each other are phenylalanine, tryptophan, and tyrosine; interchangeable hydrophobic amino acids are leucine, isoleucine, methionine, and valine; interchangeable polar amino acids are glutamine and asparagine; interchangeable basic amino acids are arginine, lysine and histidine; interchangeable acidic amino acids are aspartic acid and glutamic acid; and interchangeable small amino acids are alanine, serine, cysteine, threonine, and glycine.
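
The identity and similarity calculations defined above can be illustrated on a pair of sequences that have already been aligned. The Python sketch below is a simplified illustration, not the BLAST procedure referred to in the text: it assumes the two strings are pre-aligned and of equal length (gaps written as "-"), it uses the alignment length as the denominator, and the function and variable names are hypothetical. The substitution groups are those listed in the paragraph above.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("FWY"),    # aromatic
    set("LIMV"),   # hydrophobic
    set("QN"),     # polar
    set("RKH"),    # basic
    set("DE"),     # acidic
    set("ASCTG"),  # small
]


def is_conservative(a: str, b: str) -> bool:
    """True if two different residues fall within the same substitution group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple:
    """Return (% identity, % similarity) for two pre-aligned, equal-length strings."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be pre-aligned, non-empty, and of equal length")
    identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if "-" in (a, b):
            continue  # a gap counts toward the length but is never a match
        if a == b:
            identical += 1
            similar += 1
        elif is_conservative(a, b):
            similar += 1
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length


# The DMp53 and CPBp53 recognition peptides quoted in the text differ at two
# positions (K/R and T/S), both of which are conservative substitutions.
identity, similarity = percent_identity_and_similarity("KICTCPKRD", "RICSCPKRD")
```

With these inputs, seven of nine positions are identical and all nine are identical or conservatively substituted, so the two peptides are about 78% identical and 100% similar under this simplified scoring.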

[0049] In one preferred embodiment, a p53 protein derivative shares at least 50% sequence identity or similarity, preferably at least 60%, 70%, or 80% sequence identity or similarity, more preferably at least 85% sequence similarity or identity, still more preferably at least 90% sequence similarity or identity, and most preferably at least 95% sequence identity or similarity with a contiguous stretch of at least 10 amino acids, preferably at least 25 amino acids, more preferably at least 40 amino acids, still more preferably at least 50 amino acids, more preferably at least 100 amino acids, and in some cases, the entire length of any one of SEQ ID NOs:2, 4, 6, 8, or 10. Further preferred derivatives share these % sequence identities with the domains of SEQ ID NOs 2, 4 and 6 listed in Table I above. Additional preferred derivatives comprise a sequence that shares 100% similarity with any contiguous stretch of at least 10 amino acids, preferably at least 12, more preferably at least 15, and most preferably at least 20 amino acids of any of SEQ ID NOs 2, 4, 6, 8, and 10, and preferably functional domains thereof. Further preferred fragments comprise at least 7 contiguous amino acids, preferably at least 9, more preferably at least 12, and most preferably at least 17 contiguous amino acids of any of SEQ ID NOs 2, 4, 6, 8, and 10, and preferably functional domains thereof.

[0050] Other preferred p53 polypeptides, fragments or derivatives consist of or comprise a sequence selected from the group consisting of RICSCPKRD, KICSCPKRD, RVCSCPKRD, KVCSCPKRD, RICTCPKRD, KICTCPKRD, RVCTCPKRD, and KVCTCPKRD (i.e. sequences of the formula: (R or K)(I or V)C(S or T)CPKRD). Additional preferred p53 polypeptides, fragments or derivatives consist of or comprise a sequence selected from the group consisting of FXCKNSC and FXCQNSC, where X=any amino acid.
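
The eight peptides listed above are exactly the sequences generated by the formula (R or K)(I or V)C(S or T)CPKRD, which has three variable positions with two choices each. A short Python sketch makes the expansion explicit; the function and variable names are hypothetical.

```python
from itertools import product


def expand_motif(positions: list) -> list:
    """Expand a motif, given as a list of allowed residues per position, into peptides."""
    return ["".join(choice) for choice in product(*positions)]


# (R or K)(I or V)C(S or T)CPKRD, one entry per position.
DNA_RECOGNITION_POSITIONS = ["RK", "IV", "C", "ST", "C", "P", "K", "R", "D"]

peptides = expand_motif(DNA_RECOGNITION_POSITIONS)
print(len(peptides), sorted(peptides))
```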

[0051] The fragment or derivative of any of the p53 proteins is preferably “functionally active”, meaning that the p53 protein derivative or fragment exhibits one or more functional activities associated with a full-length, wild-type p53 protein comprising the amino acid sequence of any of SEQ ID NOs:2, 4, 6, 8, or 10. As one example, a fragment or derivative may have antigenicity such that it can be used in immunoassays, for immunization, for inhibition of p53 activity, etc., as discussed further below regarding generation of antibodies to p53 proteins. Preferably, a functionally active p53 fragment or derivative is one that displays one or more biological activities associated with p53 proteins such as regulation of the cell cycle, or transcription control. The functional activity of p53 proteins, derivatives and fragments can be assayed by various methods known to one skilled in the art (Current Protocols in Protein Science (1998) Coligan et al., eds., John Wiley & Sons, Inc., Somerset, N.J.). Example 12 below describes a variety of suitable assays for assessing p53 function.

[0052] p53 derivatives can be produced by various methods known in the art. The manipulations which result in their production can occur at the gene or protein level. For example, a cloned p53 gene sequence can be cleaved at appropriate sites with restriction endonuclease(s) (Wells et al., Philos. Trans. R. Soc. London SerA (1986) 317:415), followed by further enzymatic modification if desired, isolated, and ligated in vitro, and expressed to produce the desired derivative. Alternatively, a p53 gene can be mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or termination sequences, or to create variations in coding regions and/or to form new restriction endonuclease sites or destroy preexisting ones, to facilitate further in vitro modification. A variety of mutagenesis techniques are known in the art such as chemical mutagenesis, in vitro site-directed mutagenesis (Carter et al., Nucl. Acids Res. (1986) 13:4331), use of TAB® linkers (available from Pharmacia and Upjohn, Kalamazoo, Mich.), etc.

[0053] At the protein level, manipulations include post-translational modification, e.g. glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, linkage to an antibody molecule or other cellular ligand, etc. Any of numerous chemical modifications may be carried out by known techniques (e.g. specific chemical cleavage by cyanogen bromide, trypsin, chymotrypsin, papain, V8 protease, NaBH₄, acetylation, formylation, oxidation, reduction, metabolic synthesis in the presence of tunicamycin, etc.). Derivative proteins can also be chemically synthesized by use of a peptide synthesizer, for example to introduce nonclassical amino acids or chemical amino acid analogs as substitutions or additions into the p53 protein sequence.

[0054] Chimeric or fusion proteins can be made comprising a p53 protein or fragment thereof (preferably comprising one or more structural or functional domains of the p53 protein) joined at its N- or C-terminus via a peptide bond to an amino acid sequence of a different protein. A chimeric product can be made by ligating the appropriate nucleic acid sequences encoding the desired amino acid sequences to each other in the proper coding frame using standard methods and expressing the chimeric product. A chimeric product may also be made by protein synthetic techniques, e.g. by use of a peptide synthesizer.

[0055] p33 and Rb Proteins

[0056] The invention also provides amino acid sequences for Drosophila p33 (SEQ ID NO:20) and Rb (SEQ ID NO:22) tumor suppressors. Derivatives and fragments of these sequences can be prepared as described above for the p53 protein sequences. Preferred fragments and derivatives comprise the same number of contiguous amino acids or same degrees of percent identity or similarity as described above for p53 amino acid sequences.

[0057] p53 Gene Regulatory Elements

[0058] p53 gene regulatory DNA elements, such as enhancers or promoters that reside within the 5′ UTRs of SEQ ID NOs 1, 3, and 5, as shown in Table I above, or within nucleotides 1-1225 of SEQ ID NO:18, can be used to identify tissues, cells, genes and factors that specifically control p53 protein production. Preferably at least 20, more preferably at least 25, and most preferably at least 50 contiguous nucleotides within the 5′ UTRs are used. Analyzing components that are specific to p53 protein function can lead to an understanding of how to manipulate these regulatory processes, for either pesticide or therapeutic applications, as well as an understanding of how to diagnose dysfunction in these processes.

[0059] Gene fusions with the p53 regulatory elements can be made. For compact genes that have relatively few and small intervening sequences, such as those described herein for Drosophila, it is typically the case that the regulatory elements that control spatial and temporal expression patterns are found in the DNA immediately upstream of the coding region, extending to the nearest neighboring gene. Regulatory regions can be used to construct gene fusions where the regulatory DNAs are operably fused to a coding region for a reporter protein whose expression is easily detected, and these constructs are introduced as transgenes into the animal of choice. An entire regulatory DNA region can be used, or the regulatory region can be divided into smaller segments to identify sub-elements that might be specific for controlling expression in a given cell type or stage of development. One suitable method to decipher regions containing regulatory sequences is by an in vitro CAT assay (Mercer, Crit. Rev. Euk. Gene Exp. (1992) 2:251-263; Sambrook et al., supra; and Gorman et al., Mol. Cell. Biol. (1992) 2:1044-1051). Additional reporter proteins that can be used for construction of these gene fusions include E. coli beta-galactosidase and green fluorescent protein (GFP). These can be detected readily in situ, and thus are useful for histological studies and can be used to sort cells that express p53 proteins (O'Kane and Gehring PNAS (1987) 84(24):9123-9127; Chalfie et al., Science (1994) 263:802-805; and Cumberledge and Krasnow (1994) Methods in Cell Biology 44:143-159). Recombinase proteins, such as FLP or cre, can be used in controlling gene expression through site-specific recombination (Golic and Lindquist (1989) Cell 59(3):499-509; White et al., Science (1996) 271:805-807). Toxic proteins, such as the reaper and hid cell death proteins, are useful to specifically ablate cells that normally express p53 proteins in order to assess the physiological function of the cells (Kingston, In Current Protocols in Molecular Biology (1998) Ausubel et al., John Wiley & Sons, Inc. sections 12.0.3-12.10). Any other protein can likewise be used where it is desired to examine the function of that particular protein specifically in cells that synthesize p53 proteins.

[0060] Alternatively, a binary reporter system can be used, similar to that described further below, where the p53 regulatory element is operably fused to the coding region of an exogenous transcriptional activator protein, such as the GAL4 or tTA activators described below, to create a p53 regulatory element “driver gene”. For the other half of the binary system, the exogenous activator controls a separate “target gene” containing a coding region of a reporter protein operably fused to a cognate regulatory element for the exogenous activator protein, such as UASG or a tTA-response element, respectively. An advantage of a binary system is that a single driver gene construct can be used to activate transcription from preconstructed target genes encoding different reporter proteins, each with its own uses as delineated above.

[0061] p53 regulatory element-reporter gene fusions are also useful for tests of genetic interactions, where the objective is to identify those genes that have a specific role in controlling the expression of p53 genes, or promoting the growth and differentiation of the tissues that express the p53 protein. p53 gene regulatory DNA elements are also useful in protein-DNA binding assays to identify gene regulatory proteins that control the expression of p53 genes. The gene regulatory proteins can be detected using a variety of methods that probe specific protein-DNA interactions well known to those skilled in the art (Kingston, supra) including in vivo footprinting assays based on protection of DNA sequences from chemical and enzymatic modification within living or permeabilized cells; and in vitro footprinting assays based on protection of DNA sequences from chemical or enzymatic modification using protein extracts, nitrocellulose filter-binding assays and gel electrophoresis mobility shift assays using radioactively labeled regulatory DNA elements mixed with protein extracts. Candidate p53 gene regulatory proteins can be purified using a combination of conventional and DNA-affinity purification techniques. Molecular cloning strategies can also be used to identify proteins that specifically bind p53 gene regulatory DNA elements. For example, a Drosophila cDNA library in an expression vector can be screened for cDNAs that encode p53 gene regulatory element DNA-binding activity. Similarly, the yeast “one-hybrid” system can be used (Li and Herskowitz, Science (1993) 262:1870-1874; Luo et al., Biotechniques (1996) 20(4):564-568; Vidal et al., PNAS (1996) 93(19):10315-10320).

[0062] Assays for Tumor Suppressor Genes

[0063] The p53 tumor suppressor gene encodes a transcription factor implicated in regulation of cell proliferation, control of the cell cycle, and induction of apoptosis. Various experimental methods may be used to assess the role of the insect p53 genes in each of these areas.

[0064] Transcription Activity Assays

[0065] Due to its acidic region, wild type p53 binds both specifically and non-specifically to DNA in order to mediate its function (Zambetti and Levine, supra). Transcriptional regulation by the p53 protein or its fragments may be examined by any method known in the art. An electrophoretic mobility shift assay can be used to characterize DNA sequences to which p53 binds, and thus can assist in the identification of genes regulated by p53. Briefly, cells are grown and transfected with various amounts of wild type or mutated transcription factor of interest (in this case, p53), harvested 48 hr after transfection, and lysed to prepare nuclear extracts. Preparations of Drosophila nuclear extracts for use in mobility shift assays may be done as described in Dignam et al., Nucleic Acids Res. (1983) 11:1475-1489. Additionally, complementary, single-stranded oligonucleotides corresponding to target sequences for binding are synthesized and self-annealed to a final concentration of 10-15 ng/μl. Double stranded DNA is verified by gel electrophoretic analysis (e.g., on a 7% polyacrylamide gel, by methods known in the art), and end-labeled with 20 μCi [32P] γ-dATP. The nuclear extracts are mixed with the double stranded target sequences under conditions conducive for binding and the results are analyzed by polyacrylamide gel electrophoresis.

[0066] Another suitable method to determine DNA sequences to which p53 binds is by DNA footprinting (Schmitz et al., Nucleic Acids Research (1978) 5:3157-3170).

[0067] Apoptosis Assays

[0068] A variety of methods may be used to examine apoptosis. One method is the terminal deoxynucleotidyl transferase-mediated digoxigenin-11-dUTP nick end labeling (TUNEL) assay, which measures the nuclear DNA fragmentation characteristic of apoptosis (Lazebnik et al., Nature (1994) 371:346-347; White et al., Science (1994) 264:677-683). Additionally, commercial kits can be used for detection of apoptosis (ApoAlert®, available from Clontech, Palo Alto, Calif.).

[0069] Apoptosis may also be assayed by a variety of staining methods. Acridine orange can be used to detect apoptosis in cultured cells (Lucas et al., Blood (1998) 15:4730-41) and in intact Drosophila tissues, which can also be stained with Nile Blue (Abrams et al., Development (1993) 117:29-43). Another assay that can be used to detect DNA laddering employs ethidium bromide staining and electrophoresis of DNA on an agarose gel (Civielli et al., Int. J. Cancer (1995) 27:673-679; Young, J. Biol. Chem. (1998) 273:25198-25202).

[0070] Proliferation and Cell Cycle Assays

[0071] Proliferating cells may be identified by bromodeoxyuridine (BRDU) incorporation into cells undergoing DNA synthesis and detection by an anti-BRDU antibody (Hoshino et al., Int. J. Cancer (1986) 38:369; Campana et al., J. Immunol. Meth. (1988) 107:79). This assay can be used to reproducibly identify S-phase cells in Drosophila embryos (Edgar and O'Farrell, Cell (1990) 62:469-480) and imaginal discs (Secombe et al., Genetics (1998) 149:1867-1882). S-phase DNA synthesis can also be quantified by measuring [³H]-thymidine incorporation using a scintillation counter (Chen, Oncogene (1996) 13:1395-403; Jeoung, J. Biol. Chem. (1995) 270:18367-73). Cell proliferation may be measured by counting samples of a cell population over time, for example using a hemacytometer and Trypan-blue staining.

[0072] The DNA content and/or mitotic index of the cells may be measured based on the DNA ploidy value of the cell using a variety of methods known in the art such as a propidium iodide assay (Turner et al., Prostate (1998) 34:175-81) or Feulgen staining using a computerized microdensitometry staining system (Bacus, Am. J. Pathol. (1989) 135:783-92).

[0073] The effect of p53 overexpression or loss-of-function on Drosophila cell proliferation can be assayed in vivo using an assay in which clones of cells with altered gene expression are generated in the developing wing disc of Drosophila (Neufeld et al., Cell (1998) 93:1183-93). The clones coexpress GFP, which allows the size and DNA content of the mutant and wild-type cells from dissociated discs to be compared by FACS analysis.

[0074] Tumor Formation and Transformation Assays

[0075] A variety of in vivo and in vitro tumor formation assays are known in the art that can be used to assay p53 function. Such assays can be used to detect foci formation (Beenken, J. Surg. Res. (1992) 52:401-5), in vitro transformation (Ginsberg, Oncogene (1991) 6:669-72), tumor formation in nude mice (Endlich, Int. J. Radiat. Biol. (1993) 64:715-26), tumor formation in Drosophila (Tao et al., Nat. Genet. (1999) 21:177-181), and anchorage-independent growth in soft agar (Endlich, supra). Loss of indicia of differentiation may indicate transformation, including loss of differentiation markers, cell rounding, loss of adhesion, loss of polarity, loss of contact inhibition, loss of anchorage dependence, protease release, increased sugar transport, decreased serum requirement, and expression of fetal antigens.

[0076] Generation and Genetic Analysis of Animals and Cell Lines with Altered Expression of p53 Gene

[0077] Both genetically modified animal models (i.e. in vivo models), such as C. elegans and Drosophila, and in vitro models such as genetically engineered cell lines expressing or mis-expressing p53 genes, are useful for the functional analysis of these proteins. Model systems that display detectable phenotypes can be used for the identification and characterization of p53 genes or other genes of interest and/or phenotypes associated with the mutation or mis-expression of p53. The term “mis-expression” as used herein encompasses mis-expression due to gene mutations. Thus, a mis-expressed p53 protein may be one having an amino acid sequence that differs from wild-type (i.e. it is a derivative of the normal protein). A mis-expressed p53 protein may also be one in which one or more N- or C-terminal amino acids have been deleted, and thus is a “fragment” of the normal protein. As used herein, “mis-expression” also includes ectopic expression (e.g. by altering the normal spatial or temporal expression), over-expression (e.g. by multiple gene copies), underexpression, non-expression (e.g. by gene knockout or blocking expression that would otherwise normally occur), and further, expression in ectopic tissues.

[0078] The in vivo and in vitro models may be genetically engineered or modified so that they 1) have deletions and/or insertions of a p53 gene, 2) harbor interfering RNA sequences derived from a p53 gene, 3) have had an endogenous p53 gene mutated (e.g. contain deletions, insertions, rearrangements, or point mutations in the p53 gene), and/or 4) contain transgenes for mis-expression of wild-type or mutant forms of a p53 gene. Such genetically modified in vivo and in vitro models are useful for identification of genes and proteins that are involved in the synthesis, activation, control, etc. of p53, and also downstream effectors of p53 function, genes regulated by p53, etc. The model systems can be used for testing potential pharmaceutical and pesticidal compounds that interact with p53, for example by administering the compound to the model system using any suitable method (e.g. direct contact, ingestion, injection, etc.) and observing any changes in phenotype, for example defective movement, lethality, etc. Various genetic engineering and expression modification methods which can be used are well-known in the art, including chemical mutagenesis, transposon mutagenesis, antisense RNAi, dsRNAi, and transgene-mediated mis-expression.

[0079] Generating Loss-of-function Mutations by Mutagenesis

[0080] Loss-of-function mutations in an insect p53 gene can be generated by any of several mutagenesis methods known in the art (Ashburner, In Drosophila melanogaster: A Laboratory Manual (1989), Cold Spring Harbor, N.Y., Cold Spring Harbor Laboratory Press: pp. 299-418; Fly pushing: The Theory and Practice of Drosophila melanogaster Genetics (1997) Cold Spring Harbor Press, Plainview, N.Y., hereinafter “Fly Pushing”). Techniques for producing mutations in a gene or genome include use of radiation (e.g., X-ray, UV, or gamma ray); chemicals (e.g., EMS, MMS, ENU, formaldehyde, etc.); and insertional mutagenesis by mobile elements including dysgenesis induced by transposon insertions, or transposon-mediated deletions, for example, male recombination, as described below. Other methods of altering expression of genes include use of transposons (e.g., P element, EP-type “overexpression trap” element, mariner element, piggyBac transposon, hermes, minos, sleeping beauty, etc.) to misexpress genes; antisense; double-stranded RNA interference; peptide and RNA aptamers; directed deletions; homologous recombination; dominant negative alleles; and intrabodies.

[0081] Transposon insertions lying adjacent to a p53 gene can be used to generate deletions of flanking genomic DNA, which if induced in the germline, are stably propagated in subsequent generations. The utility of this technique in generating deletions has been demonstrated and is well-known in the art. One version of the technique using collections of P element transposon induced recessive lethal mutations (P lethals) is particularly suitable for rapid identification of novel, essential genes in Drosophila (Cooley et al., Science (1988) 239:1121-1128; Spradling et al., PNAS (1995) 92:10824-10830). Since the sequence of the P elements is known, the genomic sequence flanking each transposon insert is determined either by plasmid rescue (Hamilton et al., PNAS (1991) 88:2731-2735) or by inverse polymerase chain reaction (Rehm, http://www.fruitfly.org/methods/). A more recent version of the transposon insertion technique in male Drosophila using P elements is known as P-mediated male recombination (Preston and Engels, Genetics (1996) 144:1611-1638).

[0082] Generating Loss-of-function Phenotypes Using RNA-based Methods

[0083] p53 genes may be identified and/or characterized by generating loss-of-function phenotypes in animals of interest through RNA-based methods, such as antisense RNA (Schubiger and Edgar, Methods in Cell Biology (1994) 44:697-713). One form of the antisense RNA method involves the injection of embryos with an antisense RNA that is partially homologous to the gene of interest (in this case the p53 gene). Another form of the antisense RNA method involves expression of an antisense RNA partially homologous to the gene of interest by operably joining a portion of the gene of interest in the antisense orientation to a powerful promoter that can drive the expression of large quantities of antisense RNA, either generally throughout the animal or in specific tissues. Antisense RNA-generated loss-of-function phenotypes have been reported previously for several Drosophila genes including cactus, pecanex, and Krüppel (LaBonne et al., Dev. Biol. (1989) 136(1):1-16; Schuh and Jackle, Genome (1989) 31(1):422-425; Geisler et al., (1992) 71(4):613-621).

[0084] Loss-of-function phenotypes can also be generated by cosuppression methods (Bingham, Cell (1997) 90(3):385-387; Smyth, Curr. Biol. (1997) 7(12):793-795; Que and Jorgensen, Dev. Genet. (1998) 22(1):100-109). Cosuppression is a phenomenon of reduced gene expression produced by expression or injection of a sense strand RNA corresponding to a partial segment of the gene of interest. Cosuppression effects have been employed extensively in plants and C. elegans to generate loss-of-function phenotypes. Cosuppression in Drosophila has been shown, where reduced expression of the Adh gene was induced from a white-Adh transgene (Pal-Bhadra et al., Cell (1997) 90(3):479-490).

[0085] Another method for generating loss-of-function phenotypes is by double-stranded RNA interference (dsRNAi). This method is based on the interfering properties of double-stranded RNA derived from the coding regions of a gene, and has proven to be of great utility in genetic studies of C. elegans (Fire et al., Nature (1998) 391:806-811), and can also be used to generate loss-of-function phenotypes in Drosophila (Kennerdell and Carthew, Cell (1998) 95:1017-1026; Misquitta and Patterson, PNAS (1999) 96:1451-1456). Complementary sense and antisense RNAs derived from a substantial portion of a gene of interest, such as a p53 gene, are synthesized in vitro, annealed in an injection buffer, and introduced into animals by injection or other suitable methods such as by feeding, soaking the animals in a buffer containing the RNA, etc. Progeny of the dsRNA treated animals are then inspected for phenotypes of interest (PCT publication no. WO99/32619).

[0086] dsRNAi can also be achieved by causing simultaneous expression in vivo of both sense and antisense RNA from appropriately positioned promoters operably fused to p53 sequences. Alternatively, the living food of an animal can be engineered to express sense and antisense RNA, and then fed to the animal. For example, C. elegans can be fed engineered E. coli, Drosophila can be fed engineered baker's yeast, and insects such as Leptinotarsa and Heliothis and other plant-eating animals can be fed transgenic plants engineered to produce the dsRNA.

[0087] RNAi has also been successfully used in cultured Drosophila cells to inhibit expression of targeted proteins (Dixon lab, University of Michigan, http://dixonlab.biochem.med.umich.edu/protocols/RNAiExperiments.html). Thus, cell lines in culture can be manipulated using RNAi both to perturb and study the function of p53 pathway components and to validate the efficacy of therapeutic or pesticidal strategies which involve the manipulation of this pathway. A suitable protocol is described in Example 13.

[0088] Generating Loss-of-function Phenotypes Using Peptide and RNA Aptamers

[0089] Another method for generating loss-of-function phenotypes is by the use of peptide aptamers, which are peptides or small polypeptides that act as dominant inhibitors of protein function. Peptide aptamers specifically bind to target proteins, blocking their functional ability (Kolonin and Finley, PNAS (1998) 95:14266-14271). Due to the highly selective nature of peptide aptamers, they may be used not only to target a specific protein, but also to target specific functions of a given protein (e.g. transcription function). Further, peptide aptamers may be expressed in a controlled fashion by use of promoters which regulate expression in a temporal, spatial or inducible manner. Peptide aptamers act dominantly; therefore, they can be used to analyze proteins for which loss-of-function mutants are not available.

[0090] Peptide aptamers that bind with high affinity and specificity to a target protein may be isolated by a variety of techniques known in the art. In one method, they are isolated from random peptide libraries by yeast two-hybrid screens (Xu et al., PNAS (1997) 94:12473-12478). They can also be isolated from phage libraries (Hoogenboom et al., Immunotechnology (1998) 4:1-20) or chemically generated peptides/libraries.

[0091] RNA aptamers are specific RNA ligands for proteins that can specifically inhibit protein function (Good et al., Gene Therapy (1997) 4:45-54; Ellington et al., Biotechnol. Annu. Rev. (1995) 1:185-214). In vitro selection methods can be used to identify RNA aptamers having a selected specificity (Bell et al., J. Biol. Chem. (1998) 273:14309-14314). It has been demonstrated that RNA aptamers can inhibit protein function in Drosophila (Shi et al., Proc. Natl. Acad. Sci. USA (1999) 96:10033-10038). Accordingly, RNA aptamers can be used to decrease the expression of p53 protein or derivative thereof, or a protein that interacts with the p53 protein.

[0092] Transgenic animals can be generated to test peptide or RNA aptamers in vivo (Kolonin and Finley, supra). For example, transgenic Drosophila lines expressing the desired aptamers may be generated by P element mediated transformation (discussed below). The phenotypes of the progeny expressing the aptamers can then be characterized.

[0093] Generating Loss of Function Phenotypes Using Intrabodies

[0094] Intracellularly expressed antibodies, or intrabodies, are single-chain antibody molecules designed to specifically bind and inactivate target molecules inside cells. Intrabodies have been used in cell assays and in whole organisms such as Drosophila (Chen et al., Hum. Gen. Ther. (1994) 5:595-601; Hassanzadeh et al., Febs Lett. (1998) 16(1,2):75-80 and 81-86). Inducible expression vectors can be constructed with intrabodies that react specifically with p53 protein. These vectors can be introduced into model organisms and studied in the same manner as described above for aptamers.

[0095] Transgenesis

[0096] Typically, transgenic animals are created that contain gene fusions of the coding regions of the p53 gene (from either genomic DNA or cDNA) or genes engineered to encode antisense RNAs, cosuppression RNAs, interfering dsRNA, RNA aptamers, peptide aptamers, or intrabodies operably joined to a specific promoter and transcriptional enhancer whose regulation has been well characterized, preferably heterologous promoters/enhancers (i.e. promoters/enhancers that are non-native to the p53 genes being expressed).

[0097] Methods are well known for incorporating exogenous nucleic acid sequences into the genome of animals or cultured cells to create transgenic animals or recombinant cell lines. For invertebrate animal models, the most common methods involve the use of transposable elements. There are several suitable transposable elements that can be used to incorporate nucleic acid sequences into the genome of model organisms. Transposable elements are also particularly useful for inserting sequences into a gene of interest so that the encoded protein is not properly expressed, creating a “knock-out” animal having a loss-of-function phenotype. Techniques are well-established for the use of P element in Drosophila (Rubin and Spradling, Science (1982) 218:348-53; U.S. Pat. No. 4,670,388). Additionally, transposable elements that function in a variety of species have been identified, such as PiggyBac (Thibault et al., Insect Mol Biol (1999) 8(1):119-23), hobo, and hermes.

[0098] P elements, or marked P elements, are preferred for the isolation of loss-of-function mutations in Drosophila p53 genes because of the precise molecular mapping of these genes, depending on the availability and proximity of preexisting P element insertions for use as a localized transposon source (Hamilton and Zinn, Methods in Cell Biology (1994) 44:81-94; and Wolfner and Goldberg, Methods in Cell Biology (1994) 44:33-80). Typically, modified P elements are used which contain one or more elements that allow detection of animals containing the P element. Most often, marker genes are used that affect the eye color of Drosophila, such as derivatives of the Drosophila white or rosy genes (Rubin and Spradling, supra; and Klemenz et al., Nucleic Acids Res. (1987) 15(10):3947-3959). However, in principle, any gene can be used as a marker that causes a reliable and easily scored phenotypic change in transgenic animals. Various other markers include bacterial plasmid sequences having selectable markers such as ampicillin resistance (Steller and Pirrotta, EMBO J. (1985) 4:167-171); and lacZ sequences fused to a weak general promoter to detect the presence of enhancers with a developmental expression pattern of interest (Bellen et al., Genes Dev. (1989) 3(9):1288-1300). Other examples of marked P elements useful for mutagenesis have been reported (Nucleic Acids Research (1998) 26:85-88; and http://flybase.bio.indiana.edu).

[0099] A preferred method of transposon mutagenesis in Drosophila employs the “local hopping” method (Tower et al., Genetics (1993) 133:347-359). Each new P insertion line can be tested molecularly for transposition of the P element into the gene of interest (e.g. p53) by assays based on PCR. For each reaction, one PCR primer is used that is homologous to sequences contained within the P element and a second primer is homologous to the coding region or flanking regions of the gene of interest. Products of the PCR reactions are detected by agarose gel electrophoresis. The sizes of the resulting DNA fragments reveal the site of P element insertion relative to the gene of interest. Alternatively, Southern blotting and restriction mapping using DNA probes derived from genomic DNA or cDNAs of the gene of interest can be used to detect transposition events that rearrange the genomic DNA of the gene. P transposition events that map to the gene of interest can be assessed for phenotypic effects in heterozygous or homozygous mutant Drosophila.
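The arithmetic behind reading an insertion site from a PCR fragment size can be sketched as follows. This is only an illustrative sketch: the function name, the coordinate convention, and the assumed 30 bp distance between the P element primer and the end of the element are hypothetical and are not taken from the methods cited above.

```python
def insertion_site(gene_primer_pos, product_size_bp,
                   p_primer_offset_bp=30, primer_faces_downstream=True):
    """Estimate where a P element sits relative to a gene-specific primer.

    gene_primer_pos: coordinate of the 5' end of the gene-specific primer.
    product_size_bp: fragment size read from the agarose gel.
    p_primer_offset_bp: hypothetical distance from the P element primer's
        5' end to the end of the element (an assumed value).
    primer_faces_downstream: True if the gene primer points toward
        higher coordinates.
    """
    # The product spans genomic DNA from the gene primer to the insertion
    # point, plus the short stretch of P element sequence up to its primer.
    genomic_span = product_size_bp - p_primer_offset_bp
    if genomic_span < 0:
        raise ValueError("product is shorter than the P element primer offset")
    if primer_faces_downstream:
        return gene_primer_pos + genomic_span
    return gene_primer_pos - genomic_span


# A 530 bp product from a primer at coordinate 1000 places the insertion
# about 500 bp downstream of the primer under these assumptions.
print(insertion_site(1000, 530))
```

Gel sizing is approximate, so an estimate of this kind would normally be confirmed by sequencing the PCR product.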

[0100] In another embodiment, Drosophila lines carrying P insertions in the gene of interest can be used to generate localized deletions using known methods (Kaiser, BioEssays (1990) 12(6):297-301; Harnessing the power of Drosophila genetics, In Drosophila melanogaster: Practical Uses in Cell and Molecular Biology, Goldstein and Fyrberg, Eds., Academic Press, Inc. San Diego, Calif.). This is particularly useful if no P element transpositions are found that disrupt the gene of interest. Briefly, flies containing P elements inserted near the gene of interest are exposed to a further round of transposase to induce excision of the element. Progeny in which the transposon has excised are typically identified by loss of the eye color marker associated with the transposable element. The resulting progeny will include flies with either precise or imprecise excision of the P element, where the imprecise excision events often result in deletion of genomic DNA neighboring the site of P insertion. Such progeny are screened by molecular techniques to identify deletion events that remove genomic sequence from the gene of interest, and assessed for phenotypic effects in heterozygous and homozygous mutant Drosophila.

[0101] Recently a transgenesis system has been described that may have universal applicability in all eye-bearing animals and which has been proven effective in delivering transgenes to diverse insect species (Berghammer et al., Nature (1999) 402:370-371). This system includes: an artificial promoter active in eye tissue of all animal species, preferably containing three Pax6 binding sites positioned upstream of a TATA box (3xP3; Sheng et al., Genes Devel. (1997) 11:1122-1131); a strong and visually detectable marker gene, such as GFP or other autofluorescent protein genes (Prasher et al., Gene (1992) 111:229-233; U.S. Pat. No. 5,491,084); and promiscuous vectors capable of delivering transgenes to a broad range of animal species, for example transposon-based vectors derived from Hermes, PiggyBac, or mariner, or vectors based on pantropic VSVG-pseudotyped retroviruses (Burns et al., In Vitro Cell Dev Biol Anim (1996) 32:78-84; Jordan et al., Insect Mol Biol (1998) 7:215-222; U.S. Pat. No. 5,670,345). Since the same transgenesis system can be used in a variety of phylogenetically diverse animals, comparative functional studies are greatly facilitated, which is especially helpful in evaluating new applications to pest management.

[0102] In addition to creating loss-of-function phenotypes, transposable elements can be used to incorporate p53, or fragments or derivatives thereof, as an additional gene into any region of an animal's genome resulting in mis-expression (including over-expression) of the gene. A preferred vector designed specifically for misexpression of genes in transgenic Drosophila is derived from pGMR (Hay et al., Development (1994) 120:2121-2129), is 9 Kb long, and contains: an origin of replication for E. coli; an ampicillin resistance gene; P element transposon 3′ and 5′ ends to mobilize the inserted sequences; a white marker gene; and an expression unit comprising the TATA region of hsp70 enhancer and the 3′ untranslated region of α-tubulin gene. The expression unit contains a first multiple cloning site (MCS) designed for insertion of an enhancer and a second MCS located 500 bases downstream, designed for the insertion of a gene of interest. As an alternative to transposable elements, homologous recombination or gene targeting techniques can be used to substitute a heterologous p53 gene or fragment or derivative for one or both copies of the animal's homologous gene. The transgene can be under the regulation of either an exogenous or an endogenous promoter element, and be inserted as either a minigene or a large genomic fragment. Gene function can be analyzed by ectopic expression, using, for example, Drosophila (Brand et al., Methods in Cell Biology (1994) 44:635-654).

[0103] Examples of well-characterized heterologous promoters that may be used to create transgenic Drosophila include heat shock promoters/enhancers such as those of the hsp70 and hsp83 genes. Eye tissue specific promoters/enhancers include eyeless (Mozer and Benzer, Development (1994) 120:1049-1058), sevenless (Bowtell et al., PNAS (1991) 88(15):6853-6857), and glass-responsive promoters/enhancers (Quiring et al., Science (1994) 265:785-789). Wing tissue specific enhancers/promoters can be derived from the dpp or vestigial genes (Staehling-Hampton et al., Cell Growth Differ. (1994) 5(6):585-593; Kim et al., Nature (1996) 382:133-138). Finally, where it is necessary to restrict the activity of dominant active or dominant negative transgenes to regions where p53 is normally active, it may be useful to use endogenous p53 promoters. The ectopic expression of DMp53 in Drosophila larval eye using glass-responsive enhancer elements is described in Example 12 below.

[0104] In Drosophila, binary control systems that employ exogenous DNA are useful when testing the mis-expression of genes in a wide variety of developmental stage-specific and tissue-specific patterns. Two examples of binary exogenous regulatory systems include the UAS/GAL4 system from yeast (Hay et al., PNAS (1997) 94(10):5195-5200; Ellis et al., Development (1993) 119(3):855-865), and the “Tet system” derived from E. coli (Bello et al., Development (1998) 125:2193-2202). The UAS/GAL4 system is a well-established and powerful method of mis-expression which employs the UAS_(G) upstream regulatory sequence for control of promoters by the yeast GAL4 transcriptional activator protein (Brand and Perrimon, Development (1993) 118(2):401-15). In this approach, transgenic Drosophila, termed “target” lines, are generated where the gene of interest to be mis-expressed is operably fused to an appropriate promoter controlled by UAS_(G). Other transgenic Drosophila strains, termed “driver” lines, are generated where the GAL4 coding region is operably fused to promoters/enhancers that direct the expression of the GAL4 activator protein in specific tissues, such as the eye, wing, nervous system, gut, or musculature. The gene of interest is not expressed in the target lines for lack of a transcriptional activator to drive transcription from the promoter joined to the gene of interest. However, when the UAS-target line is crossed with a GAL4 driver line, mis-expression of the gene of interest is induced in resulting progeny in a specific pattern that is characteristic for that GAL4 line. The technical simplicity of this approach makes it possible to sample the effects of directed mis-expression of the gene of interest in a wide variety of tissues by generating one transgenic target line with the gene of interest, and crossing that target line with a panel of pre-existing driver lines.
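The logic of crossing one target line to a panel of driver lines can be illustrated with a toy model. The sketch below is an assumption-laden simplification: the driver names and tissue sets are hypothetical, and expression is treated as a simple on/off property of each tissue.

```python
def progeny_expression(target_carries_uas_gene, driver_tissues):
    """Tissues in which the gene of interest is mis-expressed in progeny.

    Expression requires both components: the UAS-linked gene from the
    target line and GAL4 supplied by the driver line.
    """
    if not target_carries_uas_gene or not driver_tissues:
        return frozenset()
    return frozenset(driver_tissues)


def cross_panel(target_carries_uas_gene, driver_panel):
    """Cross a single target line to every driver line in a panel."""
    return {
        name: progeny_expression(target_carries_uas_gene, tissues)
        for name, tissues in driver_panel.items()
    }


# Hypothetical driver lines, named here only for illustration.
DRIVER_PANEL = {
    "eye-GAL4": {"eye"},
    "wing-GAL4": {"wing"},
    "neural-GAL4": {"nervous system"},
}

for driver, tissues in cross_panel(True, DRIVER_PANEL).items():
    print(driver, sorted(tissues))
```

In this model the target line on its own corresponds to an empty driver tissue set, which yields no expression, mirroring the lack of a transcriptional activator in uncrossed target lines.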

[0105] In the “Tet” binary control system, transgenic Drosophila driver lines are generated where the coding region for a tetracycline-controlled transcriptional activator (tTA) is operably fused to promoters/enhancers that direct the expression of tTA in a tissue-specific and/or developmental stage-specific manner. The driver lines are crossed with transgenic Drosophila target lines where the coding region for the gene of interest to be mis-expressed is operably fused to a promoter that possesses a tTA-responsive regulatory element. When the resulting progeny are supplied with food supplemented with a sufficient amount of tetracycline, expression of the gene of interest is blocked. Expression of the gene of interest can be induced at will simply by removal of tetracycline from the food. Also, the level of expression of the gene of interest can be adjusted by varying the level of tetracycline in the food. Thus, the use of the Tet system as a binary control mechanism for mis-expression has the advantage of providing a means to control the amplitude and timing of mis-expression of the gene of interest, in addition to spatial control. Consequently, if a p53 gene has lethal or deleterious effects when mis-expressed at an early stage in development, such as the embryonic or larval stages, the function of the gene in the adult can still be assessed by adding tetracycline to the food during early stages of development and removing tetracycline later so as to induce mis-expression only at the adult stage.
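The temporal control described for the Tet system can be summarized as a small truth-table model. This is a minimal sketch under stated assumptions: the stage names are illustrative, expression is modeled as on/off only (the graded response to tetracycline dose is not represented), and "tetracycline present" means a dose sufficient to block expression.

```python
STAGES = ["embryo", "larva", "pupa", "adult"]


def gene_expressed(tta_present, tetracycline_in_food):
    """tTA drives the target gene unless tetracycline blocks it."""
    return tta_present and not tetracycline_in_food


def expression_schedule(tta_stages, tetracycline_stages):
    """Map each developmental stage to whether the gene is expressed."""
    return {
        stage: gene_expressed(stage in tta_stages,
                              stage in tetracycline_stages)
        for stage in STAGES
    }


# Feed tetracycline through development, then withdraw it in the adult
# so that mis-expression is induced only at the adult stage.
schedule = expression_schedule(
    tta_stages=set(STAGES),
    tetracycline_stages={"embryo", "larva", "pupa"},
)
print(schedule)
```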

[0106] Dominant negative mutations, by which the mutation causes a protein to interfere with the normal function of a wild-type copy of the protein, and which can result in loss-of-function or reduced-function phenotypes in the presence of a normal copy of the gene, can be made using known methods (Herskowitz, Nature (1987) 329:219-222). In the case of active monomeric proteins, overexpression of an inactive form, achieved, for example, by linking the mutant gene to a highly active promoter, can cause competition for natural substrates or ligands sufficient to significantly reduce net activity of the normal protein. Alternatively, changes to active site residues can be made to create a virtually irreversible association with a target.

[0107] Assays for Change in Gene Expression

[0108] Various expression analysis techniques may be used to identify genes which are differentially expressed between a cell line or an animal expressing a wild type p53 gene compared to another cell line or animal expressing a mutant p53 gene. Such expression profiling techniques include differential display, serial analysis of gene expression (SAGE), transcript profiling coupled to a gene database query, nucleic acid array technology, subtractive hybridization, and proteome analysis (e.g. mass-spectrometry and two-dimensional protein gels). Nucleic acid array technology may be used to determine the genome-wide expression pattern in a normal animal for comparison with an animal having a mutation in the p53 gene. Gene expression profiling can also be used to identify other genes or proteins that may have a functional relation to p53. The genes are identified by detecting changes in their expression levels following mutation, over-expression, under-expression, mis-expression or knock-out, of the p53 gene.
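One simple way to flag genes whose expression changes between a wild type and a p53 mutant sample is a log2 fold-change cutoff. The sketch below is illustrative only: the gene names, expression values, pseudocount, and two-fold threshold are all assumptions, and a real analysis would add replicates and a statistical test.

```python
import math


def differentially_expressed(wild_type, mutant,
                             min_abs_log2_fc=1.0, pseudocount=1.0):
    """Return genes whose expression differs between two samples.

    wild_type, mutant: dicts mapping gene name to an expression level.
    min_abs_log2_fc: cutoff on |log2(mutant / wild type)|; 1.0 means a
        two-fold change in either direction (an assumed threshold).
    pseudocount: added to both values to avoid division by zero.
    """
    hits = {}
    # Only genes measured in both samples can be compared.
    for gene in wild_type.keys() & mutant.keys():
        ratio = (mutant[gene] + pseudocount) / (wild_type[gene] + pseudocount)
        log2_fc = math.log2(ratio)
        if abs(log2_fc) >= min_abs_log2_fc:
            hits[gene] = log2_fc
    return hits


# Hypothetical expression levels, for illustration only.
wt = {"geneA": 99, "geneB": 50, "geneC": 7}
mut = {"geneA": 24, "geneB": 52, "geneC": 31}
print(differentially_expressed(wt, mut))
```

With these made-up values, geneA falls four-fold and geneC rises four-fold in the mutant, while geneB is essentially unchanged and is not reported.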

[0109] Phenotypes Associated with p53 Gene Mutations

[0110] After isolation of model animals carrying mutated or mis-expressed p53 genes or inhibitory RNAs, animals are carefully examined for phenotypes of interest. For analysis of p53 genes that have been mutated, animal models that are both homozygous and heterozygous for the altered p53 gene are analyzed. Examples of specific phenotypes that may be investigated include lethality; sterility; feeding behavior; tumor formation; perturbations in neuromuscular function including alterations in motility; and alterations in sensitivity to pharmaceuticals. Some phenotypes more specific to flies include alterations in: adult behavior such as flight ability, walking, grooming, phototaxis, mating or egg-laying; the responses of sensory organs; the morphology, size or number of adult tissues such as eyes, wings, legs, bristles, antennae, gut, fat body, gonads, and musculature; larval tissues such as mouth parts, cuticles, internal tissues or imaginal discs; or larval behavior such as feeding, molting, crawling, or puparium formation; or developmental defects in any germline or embryonic tissues.

[0111] Genomic sequences containing a p53 gene can be used to engineer an existing mutant insect line, using the transgenesis methods previously described, to determine whether the mutation is in the p53 gene. Briefly, germline transformants are crossed for complementation testing to an existing or newly created panel of insect lines whose mutations have been mapped to the vicinity of the gene of interest (Fly Pushing, supra). If a mutant line is discovered to be rescued by the genomic fragment, as judged by complementation of the mutant phenotype, then the mutant line likely harbors a mutation in the p53 gene. This prediction can be further confirmed by sequencing the p53 gene from the mutant line to identify the lesion in the p53 gene.

[0112] Identification of Genes that Modify p53 Genes

[0113] The characterization of new phenotypes created by mutations or misexpression in p53 genes enables one to test for genetic interactions between p53 genes and other genes that may participate in the same, related, or interacting genetic or biochemical pathway(s). Individual genes can be used as starting points in large-scale genetic modifier screens as described in more detail below. Alternatively, RNAi methods can be used to simulate loss-of-function mutations in the genes being analyzed. It is of particular interest to investigate whether there are any interactions of p53 genes with other well-characterized genes, particularly genes involved in regulation of the cell cycle or apoptosis.

[0114] Genetic Modifier Screens

[0115] A genetic modifier screen using invertebrate model organisms is a particularly preferred method for identifying genes that interact with p53 genes, because large numbers of animals can be systematically screened, making it more likely that interacting genes will be identified. In Drosophila, a screen of up to about 10,000 animals is considered to be a pilot-scale screen. Moderate-scale screens usually employ about 10,000 to about 50,000 flies, and large-scale screens employ greater than about 50,000 flies. In a genetic modifier screen, animals having a mutant phenotype due to a mutation in or misexpression of the p53 gene are further mutagenized, for example by chemical mutagenesis or transposon mutagenesis.

[0116] The procedures involved in typical Drosophila genetic modifier screens are well-known in the art (Wolfner and Goldberg, Methods in Cell Biology (1994) 44:33-80; and Karim et al., Genetics (1996) 143:315-329). The procedures used differ depending upon the precise nature of the mutant allele being modified. If the mutant allele is genetically recessive, as is commonly the situation for a loss-of-function allele, then most typically males, or in some cases females, which carry one copy of the mutant allele are exposed to an effective mutagen, such as EMS, MMS, ENU, triethylamine, diepoxyalkanes, ICR-170, formaldehyde, X-rays, gamma rays, or ultraviolet radiation. The mutagenized animals are crossed to animals of the opposite sex that also carry the mutant allele to be modified. In the case where the mutant allele being modified is genetically dominant, as is commonly the situation for ectopically expressed genes, wild type males are mutagenized and crossed to females carrying the mutant allele to be modified.

[0117] The progeny of the mutagenized and crossed flies that exhibit either enhancement or suppression of the original phenotype are presumed to have mutations in other genes, called “modifier genes”, that participate in the same phenotype-generating pathway. These progeny are immediately crossed to adults containing balancer chromosomes and used as founders of a stable genetic line. In addition, progeny of the founder adult are retested under the original screening conditions to ensure stability and reproducibility of the phenotype. Additional secondary screens may be employed, as appropriate, to confirm the suitability of each new modifier mutant line for further analysis.

[0118] Standard techniques used for the mapping of modifiers that come from a genetic screen in Drosophila include meiotic mapping with visible or molecular genetic markers; male-specific recombination mapping relative to P-element insertions; complementation analysis with deficiencies, duplications, and lethal P-element insertions; and cytological analysis of chromosomal aberrations (Fly Pushing, supra). Genes corresponding to modifier mutations that fail to complement a lethal P-element may be cloned by plasmid rescue of the genomic sequence surrounding that P-element. Alternatively, modifier genes may be mapped by phenotype rescue and positional cloning (Sambrook et al., supra).

[0119] Newly identified modifier mutations can be tested directly for interaction with other genes of interest known to be involved or implicated with p53 genes using methods described above. Also, the new modifier mutations can be tested for interactions with genes in other pathways that are not believed to be related to regulation of cell cycle or apoptosis. New modifier mutations that exhibit specific genetic interactions with other genes implicated in cell cycle regulation or apoptosis, and not with genes in unrelated pathways, are of particular interest.

[0120] The modifier mutations may also be used to identify “complementation groups”. Two modifier mutations are considered to fall within the same complementation group if animals carrying both mutations in trans exhibit essentially the same phenotype as animals that are homozygous for each mutation individually and generally are lethal when in trans to each other (Fly Pushing, supra). Generally, individual complementation groups defined in this way correspond to individual genes.
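Sorting mutations into complementation groups from pairwise test results amounts to a grouping problem, sketched below with a small union-find structure. The mutation names and test outcomes are hypothetical, and the sketch assumes that failure to complement is transitive, which real data (for example, intragenic complementation) can violate.

```python
def complementation_groups(mutations, failed_pairs):
    """Group mutations that fail to complement one another.

    mutations: iterable of mutation names.
    failed_pairs: pairs (a, b) whose trans-heterozygotes show the mutant
        phenotype, i.e. the two mutations fail to complement.
    """
    parent = {m: m for m in mutations}

    def find(m):
        # Walk up to the group representative, shortening the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in failed_pairs:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for m in parent:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical modifier mutations and pairwise test results.
mutations = ["m1", "m2", "m3", "m4", "m5", "m6"]
failed = [("m1", "m2"), ("m2", "m3"), ("m4", "m5")]
print(complementation_groups(mutations, failed))
```

Each returned list would, under the stated assumption, correspond to a candidate gene; a mutation that complements all others forms a group of its own.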

[0121] When p53 modifier genes are identified, homologous genes in other species can be isolated using procedures based on cross-hybridization with modifier gene DNA probes, PCR-based strategies with primer sequences derived from the modifier genes, and/or computer searches of sequence databases. For therapeutic applications related to the function of p53 genes, human and rodent homologs of the modifier genes are of particular interest.

[0122] Although the above-described Drosophila genetic modifier screens are quite powerful and sensitive, some genes that interact with p53 genes may be missed in this approach, particularly if there is functional redundancy of those genes. This is because the vast majority of the mutations generated in the standard mutagenesis methods will be loss-of-function mutations, whereas gain-of-function mutations that could reveal genes with functional redundancy will be relatively rare. Another method of genetic screening in Drosophila has been developed that focuses specifically on systematic gain-of-function genetic screens (Rorth et al., Development (1998) 125:1049-1057). This method is based on a modular mis-expression system utilizing components of the GAL4/UAS system (described above) where a modified P element, termed an “enhanced P” (EP) element, is genetically engineered to contain a GAL4-responsive UAS element and promoter. Any other transposons can also be used for this system. The resulting transposon is used to randomly tag genes by insertional mutagenesis (similar to the method of P element mutagenesis described above). Thousands of transgenic Drosophila strains, termed EP lines, can be generated, each containing a specific UAS-tagged gene. This approach takes advantage of the preference of P elements to insert at the 5′-ends of genes. Consequently, many of the genes that are tagged by insertion of EP elements become operably fused to a GAL4-regulated promoter, and increased expression or mis-expression of the randomly tagged gene can be induced by crossing in a GAL4 driver gene.

[0123] Systematic gain-of-function genetic screens for modifiers of phenotypes induced by mutation or mis-expression of a p53 gene can be performed by crossing several thousand Drosophila EP lines individually into a genetic background containing a mutant or mis-expressed p53 gene, and further containing an appropriate GAL4 driver transgene. It is also possible to remobilize the EP elements to obtain novel insertions. The progeny of these crosses are then analyzed for enhancement or suppression of the original mutant phenotype as described above. Those identified as having mutations that interact with the p53 gene can be tested further to verify the reproducibility and specificity of this genetic interaction. EP insertions that demonstrate a specific genetic interaction with a mutant or mis-expressed p53 gene have a physically tagged new gene which can be identified and sequenced using PCR or hybridization screening methods, allowing the isolation of the genomic DNA adjacent to the position of the EP element insertion.

[0124] Identification of Molecules that Interact with p53

[0125] A variety of methods can be used to identify or screen for molecules, such as proteins or other molecules, that interact with p53 protein, or derivatives or fragments thereof. The assays may employ purified p53 protein, or cell lines or a model organism such as Drosophila that has been genetically engineered to express p53 protein. Suitable screening methodologies are well known in the art to test for proteins and other molecules that interact with a gene/protein of interest (see e.g., PCT International Publication No. WO 96/34099). The newly identified interacting molecules may provide new targets for pharmaceutical agents. Any of a variety of exogenous molecules, both naturally occurring and/or synthetic (e.g., libraries of small molecules or peptides, or phage display libraries), may be screened for binding capacity. In a typical binding experiment, the p53 protein or fragment is mixed with candidate molecules under conditions conducive to binding, sufficient time is allowed for any binding to occur, and assays are performed to test for bound complexes. A variety of assays to find interacting proteins are known in the art, for example, immunoprecipitation with an antibody that binds to the protein in a complex followed by analysis by size fractionation of the immunoprecipitated proteins (e.g. by denaturing or nondenaturing polyacrylamide gel electrophoresis), Western analysis, non-denaturing gel electrophoresis, etc.

[0126] Two-hybrid Assay Systems

[0127] A preferred method for identifying interacting proteins is a two-hybrid assay system or variation thereof (Fields and Song, Nature (1989) 340:245-246; U.S. Pat. No. 5,283,173; for review see Brent and Finley, Annu. Rev. Genet. (1997) 31:663-704). The most commonly used two-hybrid screen system is performed using yeast. All systems share three elements: 1) a gene that directs the synthesis of a “bait” protein fused to a DNA binding domain; 2) one or more “reporter” genes having an upstream binding site for the bait; and 3) a gene that directs the synthesis of a “prey” protein fused to an activation domain that activates transcription of the reporter gene. For the screening of proteins that interact with p53 protein, the “bait” is preferably a p53 protein, expressed as a fusion protein to a DNA binding domain; and the “prey” protein is a protein to be tested for ability to interact with the bait, and is expressed as a fusion protein to a transcription activation domain. The prey proteins can be obtained from recombinant biological libraries expressing random peptides.

[0128] The bait fusion protein can be constructed using any suitable DNA binding domain, such as the E. coli LexA repressor protein, or the yeast GAL4 protein (Bartel et al., BioTechniques (1993) 14:920-924, Chasman et al., Mol. Cell. Biol. (1989) 9:4746-4749, Ma et al., Cell (1987) 48:847-853; Ptashne et al., Nature (1990) 346:329-331). The prey fusion protein can be constructed using any suitable activation domain such as GAL4, VP-16, etc. The preys may contain useful moieties such as nuclear localization signals (Ylikomi et al., EMBO J. (1992) 11:3681-3694; Dingwall and Laskey, Trends Biochem. Sci. (1991) 16:479-481) or epitope tags (Allen et al., Trends Biochem. Sci. (1995) 20:511-516) to facilitate isolation of the encoded proteins. Any reporter gene can be used that has a detectable phenotype, such as reporter genes that allow cells expressing them to be selected by growth on appropriate medium (e.g. HIS3, LEU2 described by Chien et al., PNAS (1991) 88:9572-9582; and Gyuris et al., Cell (1993) 75:791-803). Other reporter genes, such as LacZ and GFP, allow cells expressing them to be visually screened (Chien et al., supra).

[0129] Although the preferred host for two-hybrid screening is the yeast, the host cell in which the interaction assay and transcription of the reporter gene occurs can be any cell, such as mammalian (e.g. monkey, mouse, rat, human, bovine), chicken, bacterial, or insect cells. Various vectors and host strains for expression of the two fusion protein populations in yeast can be used (U.S. Pat. No. 5,468,614; Bartel et al., Cellular Interactions in Development (1993) Hartley, ed., Practical Approach Series xviii, IRL Press at Oxford University Press, New York, N.Y., pp. 153-179; and Fields and Sternglanz, Trends In Genetics (1994) 10:286-292). As an example of a mammalian system, interaction of activation tagged VP16 derivatives with a GAL4-derived bait drives expression of reporters that direct the synthesis of hygromycin B phosphotransferase, chloramphenicol acetyltransferase, or CD4 cell surface antigen (Fearon et al., PNAS (1992) 89:7958-7962). As another example, interaction of VP16-tagged derivatives with GAL4-derived baits drives the synthesis of SV40 T antigen, which in turn promotes the replication of the prey plasmid, which carries an SV40 origin (Vasavada et al., PNAS (1991) 88:10686-10690).

[0130] Typically, the bait p53 gene and the prey library of chimeric genes are combined by mating the two yeast strains on solid or liquid media for a period of approximately 6-8 hours. The resulting diploids contain both kinds of chimeric genes, i.e., the DNA-binding domain fusion and the activation domain fusion. Transcription of the reporter gene can be detected by a linked replication assay in the case of SV40 T antigen (Vasavada et al., supra) or using immunoassay methods (Alam and Cook, Anal. Biochem. (1990) 188:245-254). The activation of other reporter genes like URA3, HIS3, LYS2, or LEU2 enables the cells to grow in the absence of uracil, histidine, lysine, or leucine, respectively, and hence serves as a selectable marker. Other types of reporters are monitored by measuring a detectable signal. For example, GFP and lacZ have gene products that are fluorescent and chromogenic, respectively.
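
By way of illustration only, the nutritional selection described in the preceding paragraph can be written as a lookup from each reporter gene to the nutrient whose absence it allows the cell to tolerate. The Python sketch below uses hypothetical names (`REPORTER_TO_NUTRIENT`, `grows_without`) and assumes a host strain that is otherwise auxotrophic for each listed nutrient; it is a reading aid, not a description of any particular strain or protocol.

```python
# Reporter genes named in the text, mapped to the nutrient that an
# activated reporter makes dispensable. Assumption: the host strain
# cannot otherwise grow without these nutrients.
REPORTER_TO_NUTRIENT = {
    "URA3": "uracil",
    "HIS3": "histidine",
    "LYS2": "lysine",
    "LEU2": "leucine",
}


def grows_without(nutrient, active_reporters):
    """Return True if any activated reporter relieves the need for `nutrient`."""
    return any(
        REPORTER_TO_NUTRIENT.get(reporter) == nutrient
        for reporter in active_reporters
    )
```

Under these assumptions, a diploid in which bait and prey interact and activate HIS3 would be expected to grow on medium lacking histidine but not on medium lacking leucine.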

[0131] After interacting proteins have been identified, the DNA sequences encoding the proteins can be isolated. In one method, the activation domain sequences or DNA-binding domain sequences (depending on the prey hybrid used) are amplified, for example, by PCR using pairs of oligonucleotide primers specific for the coding region of the DNA binding domain or activation domain. If a shuttle (yeast to E. coli) vector is used to express the fusion proteins, the DNA sequences encoding the proteins can be isolated by transformation of E. coli using the yeast DNA and recovering the plasmids from E. coli. Alternatively, the yeast vector can be isolated, and the insert encoding the fusion protein subcloned into a bacterial expression vector, for growth of the plasmid in E. coli.

[0132] Antibodies and Immunoassay

[0133] p53 proteins encoded by any of SEQ ID NOs:2, 4, 6, 8, or 10 and derivatives and fragments thereof, such as those discussed above, may be used as an immunogen to generate monoclonal or polyclonal antibodies and antibody fragments or derivatives (e.g. chimeric, single chain, Fab fragments). For example, fragments of a p53 protein, preferably those identified as hydrophilic, are used as immunogens for antibody production using art-known methods such as by hybridomas; production of monoclonal antibodies in germ-free animals (PCT/US90/02545); the use of human hybridomas (Cole et al., PNAS (1983) 80:2026-2030; Cole et al., in Monoclonal Antibodies and Cancer Therapy (1985) Alan R. Liss, pp. 77-96), and production of humanized antibodies (Jones et al., Nature (1986) 321:522-525; U.S. Pat. No. 5,530,101). In a particular embodiment, p53 polypeptide fragments provide specific antigens and/or immunogens, especially when coupled to carrier proteins. For example, peptides are covalently coupled to keyhole limpet hemocyanin (KLH) and the conjugate is emulsified in Freund's complete adjuvant. Laboratory rabbits are immunized according to conventional protocol and bled. The presence of specific antibodies is assayed by solid phase immunosorbent assays using immobilized corresponding polypeptide. Specific activity or function of the antibodies produced may be determined by convenient in vitro, cell-based, or in vivo assays: e.g. in vitro binding assays, etc. Binding affinity may be assayed by determination of equilibrium constants of antigen-antibody association (usually at least about 10⁷ M⁻¹, preferably at least about 10⁸ M⁻¹, more preferably at least about 10⁹ M⁻¹). Example 11 below further describes the generation of anti-DMp53 antibodies.
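
The association constants quoted above are often easier to compare when restated as dissociation constants, using the standard relation Kd = 1/Ka. The short Python sketch below performs that conversion for the three thresholds in the text; the function and variable names are hypothetical and chosen only for illustration.

```python
# Association-constant thresholds quoted in the text, in M^-1.
KA_THRESHOLDS = {
    "usual": 1e7,
    "preferred": 1e8,
    "more preferred": 1e9,
}


def dissociation_constant_molar(ka_per_molar):
    """Convert an association constant Ka (M^-1) to Kd (M) via Kd = 1 / Ka."""
    if ka_per_molar <= 0:
        raise ValueError("Ka must be positive")
    return 1.0 / ka_per_molar


def kd_in_nanomolar():
    """Return each threshold restated as a dissociation constant in nM."""
    return {
        label: dissociation_constant_molar(ka) * 1e9
        for label, ka in KA_THRESHOLDS.items()
    }
```

On this conversion the three thresholds correspond to dissociation constants of roughly 100 nM, 10 nM, and 1 nM, respectively.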

[0134] Immunoassays can be used to identify proteins that interact with or bind to p53 protein. Various assays are available for testing the ability of a protein to bind to or compete with binding to a wild-type p53 protein or for binding to an anti-p53 protein antibody. Suitable assays include radioimmunoassays, ELISA (enzyme linked immunosorbent assay), immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (e.g., using colloidal gold, enzyme or radioisotope labels), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, immunoelectrophoresis assays, etc.

[0135] Identification of Potential Drug Targets

[0136] Once new p53 genes or p53 interacting genes are identified, they can be assessed as potential drug or pesticide targets using animal models such as Drosophila or other insects, or using cells that express endogenous p53, or that have been engineered to express p53.

[0137] Assays of Compounds on Insects

[0138] Potential insecticidal compounds can be administered to insects in a variety of ways, including orally (including addition to synthetic diet, application to plants or prey to be consumed by the test organism), topically (including spraying, direct application of compound to animal, allowing animal to contact a treated surface), or by injection. Insecticides are typically very hydrophobic molecules and must commonly be dissolved in organic solvents, which are allowed to evaporate in the case of methanol or acetone, or at low concentrations can be included to facilitate uptake (ethanol, dimethyl sulfoxide).

[0139] The first step in an insect assay is usually the determination of the minimal lethal dose (MLD) on the insects after a chronic exposure to the compounds. The compounds are usually diluted in DMSO, and applied to the food surface bearing 0-48 hour old embryos and larvae. In addition to MLD, this step allows the determination of the fraction of eggs that hatch, behavior of the larvae, such as how they move/feed compared to untreated larvae, the fraction that survive to pupate, and the fraction that eclose (emergence of the adult insect from puparium). Based on these results more detailed assays with shorter exposure times may be designed, and larvae might be dissected to look for obvious morphological defects. Once the MLD is determined, more specific acute and chronic assays can be designed.
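
The quantities scored in this first step (fraction hatching, fraction pupating, fraction eclosing, and the MLD) lend themselves to a simple tabulation. The Python sketch below is one possible way to do so. The class, the function names, and the example counts are all hypothetical, and the working definition of MLD used here (the lowest tested dose at which no animals eclose) is an assumption made for illustration, since the text does not define the criterion.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DoseResult:
    dose: float    # compound concentration, arbitrary units (hypothetical)
    eggs: int      # embryos placed on treated food
    hatched: int
    pupated: int
    eclosed: int


def survival_fractions(result: DoseResult) -> dict:
    """Fractions of the starting eggs that hatch, pupate, and eclose."""
    if result.eggs <= 0:
        raise ValueError("eggs must be positive")
    return {
        "hatch": result.hatched / result.eggs,
        "pupate": result.pupated / result.eggs,
        "eclose": result.eclosed / result.eggs,
    }


def minimal_lethal_dose(results: Sequence[DoseResult]) -> Optional[float]:
    """Lowest tested dose with no eclosing adults (assumed MLD criterion)."""
    lethal = [r.dose for r in results if r.eclosed == 0]
    return min(lethal) if lethal else None


# Invented counts for illustration only; not data from the specification.
EXAMPLE_SERIES = [
    DoseResult(dose=0.0, eggs=100, hatched=95, pupated=90, eclosed=88),
    DoseResult(dose=1.0, eggs=100, hatched=90, pupated=60, eclosed=40),
    DoseResult(dose=10.0, eggs=100, hatched=50, pupated=5, eclosed=0),
    DoseResult(dose=100.0, eggs=100, hatched=0, pupated=0, eclosed=0),
]
```

A different lethality criterion (for example, a threshold on the eclosion fraction relative to the untreated control) could be substituted in `minimal_lethal_dose` without changing the rest of the sketch.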

[0140] In a typical acute assay, compounds are applied to the food surface for embryos, larvae, or adults, and the animals are observed after 2 hours and after an overnight incubation. For application on embryos, defects in development and the percent that survive to adulthood are determined. For larvae, defects in behavior, locomotion, and molting may be observed. For application on adults, behavior and neurological defects are observed, and effects on fertility are noted. Any deleterious effect on insect survival, motility and fertility indicates that the compound has utility in controlling pests.

[0141] For a chronic exposure assay, adults are placed on vials containing the compounds for 48 hours, then transferred to a clean container and observed for fertility, neurological defects, and death.

[0142] Assay of Compounds Using Cell Cultures

[0143] Compounds that modulate (e.g. block or enhance) p53 activity may be tested on cells expressing endogenous normal or mutant p53s, and/or on cells transfected with vectors that express p53, or derivatives or fragments of p53. The compounds are added at varying concentration and their ability to modulate the activity of p53 genes is determined using any of the assays for tumor suppressor genes described above (e.g. by measuring transcription activity, apoptosis, proliferation/cell cycle, and/or transformation). Compounds that selectively modulate p53 are identified as potential drug candidates having p53 specificity.

[0144] Identification of small molecules and compounds as potential pharmaceutical compounds from large chemical libraries requires high-throughput screening (HTS) methods (Bolger, Drug Discovery Today (1999) 4:251-253). Several of the assays mentioned herein can lend themselves to such screening methods. For example, cells or cell lines expressing wild type or mutant p53 protein or its fragments, and a reporter gene can be subjected to compounds of interest, and depending on the reporter genes, interactions can be measured using a variety of methods such as color detection, fluorescence detection (e.g. GFP), autoradiography, scintillation analysis, etc.

[0145] Agricultural Uses of Insect p53 Sequences

[0146] Insect p53 genes may be used in controlling agriculturally important pest species. For example, the proteins, genes, and RNAs disclosed herein, or their fragments, may have activity in modifying the growth, feeding and/or reproduction of crop-damaging insects, or insect pests of farm animals or of other animals. In general, effective pesticides exert a disabling activity on the target pest such as lethality, sterility, paralysis, blocked development, or cessation of feeding. Such pests include egg, larval, juvenile and adult forms of flies, mosquitos, fleas, moths, beetles, cicadas, grasshoppers, aphids and crickets. The functional analyses of insect p53 genes described herein have revealed roles for these genes and proteins in controlling apoptosis, response to DNA damaging agents, and protection of cells of the germline. Since overexpression of DMp53 induces apoptosis in Drosophila, the insect p53 genes and proteins in an activated form have application as “cell death” genes which, if delivered to or expressed in specific target tissues such as the gut, nervous system, or gonad, would have a use in controlling insect pests. Alternatively, since DMp53 plays a role in response to DNA damaging agents such as X-rays, interference with p53 function in insects has application in sensitizing insects to DNA damaging agents for sterilization. For example, current methods for controlling pest populations through the release of irradiated insects into the environment (Knipling, J Econ Ent (1955) 48: 459-462; Knipling (1979) U.S. Dept. Agric. Handbook No. 512) could be improved by causing expression of dominant negative forms of p53 genes, proteins, or RNAs in insects and most preferably germline tissue of insects, or by exposing insects to chemical compounds which block p53 function.

[0147] Mutational analysis of insect p53 proteins may also be used in connection with the control of agriculturally-important pests. In this regard, mutational analysis of insect p53 genes provides a rational approach to determine the precise biological function of this class of proteins in invertebrates. Further, mutational analysis coupled with large-scale systematic genetic modifier screens provides a means to identify and validate other potential pesticide targets that might be constituents of the p53 signaling pathway. Tests for pesticidal activities can be any method known in the art. Pesticides comprising the nucleic acids of the insect p53 proteins may be prepared in a suitable vector for delivery to a plant or animal. Such vectors include Agrobacterium tumefaciens Ti plasmid-based vectors for the generation of transgenic plants (Horsch et al., Proc Natl Acad Sci U S A. (1986) 83(8):2571-2575; Fraley et al., Proc. Natl. Acad. Sci. USA (1983) 80:4803) or recombinant cauliflower mosaic virus for the inoculation of plant cells or plants (U.S. Pat. No. 4,407,956); retrovirus based vectors for the introduction of genes into vertebrate animals (Burns et al., Proc. Natl. Acad. Sci. USA (1993) 90:8033-37); and vectors based on transposable elements for incorporation into invertebrate animals using vectors and methods already described above. For example, transgenic insects can be generated using a transgene comprising a p53 gene operably fused to an appropriate inducible promoter, such as a tTA-responsive promoter, in order to direct expression of the tumor suppressor protein at an appropriate time in the life cycle of the insect. In this way, one may test efficacy as an insecticide in, for example, the larval phase of the life cycle (e.g., when feeding does the greatest damage to crops).

[0148] Recombinant or synthetic p53 proteins, RNAs or their fragments, in wild-type or mutant forms, can be assayed for insecticidal activity by injection of solutions of p53 proteins or RNAs into the hemolymph of insect larvae (Blackburn, et al., Appl. Environ. Microbiol. (1998) 64(8):3036-41; Bowen and Ensign, Appl. Environ. Microbiol. (1998) 64(8):3029-35). Further, transgenic plants that express p53 proteins or RNAs or their fragments can be tested for activity against insect pests (Estruch et al., Nat. Biotechnol. (1997) 15(2):137-41).

[0149] Insect p53 genes may be used as insect control agents in the form of recombinant viruses that direct the expression of a tumor suppressor gene in the target pest. A variety of suitable recombinant virus systems for expression of proteins in infected insect cells are well known in the art. A preferred system uses recombinant baculoviruses. The use of recombinant baculoviruses as a means to engineer expression of toxic proteins in insects, and as insect control agents, has a number of specific advantages including host specificity, environmental safety, the availability of vector systems, and the potential use of the recombinant virus directly as a pesticide without the need for purification or formulation of the tumor suppressor protein (Cory and Bishop, Mol. Biotechnol. (1997) 7(3):303-13; and U.S. Pat. Nos. 5,470,735; 5,352,451; 5,770,192; 5,759,809; 5,665,349; and 5,554,592). Thus, recombinant baculoviruses that direct the expression of insect p53 genes can be used both for testing the pesticidal activity of tumor suppressor proteins under controlled laboratory conditions, and as insect control agents in the field. One disadvantage of wild type baculoviruses as insect control agents can be the amount of time between application of the virus and death of the target insect, typically one to two weeks. During this period, the insect larvae continue to feed and damage crops. Consequently, there is a need to develop improved baculovirus-derived insect control agents which result in a rapid cessation of feeding of infected target insects. The cell cycle and apoptotic regulatory roles of p53 in vertebrates raise the possibility that expression of tumor suppressor proteins from recombinant baculovirus in infected insects may have a desirable effect in controlling metabolism and limiting feeding of insect pests.

[0150] Insect p53 genes, RNAs, proteins or fragments may be formulated with any carrier suitable for agricultural use, such as water, organic solvents and/or inorganic solvents. The pesticide composition may be in the form of a solid or liquid composition and may be prepared by fundamental formulation processes such as dissolving, mixing, milling, granulating, and dispersing. Compositions may contain an insect p53 protein or gene in a mixture with agriculturally acceptable excipients such as vehicles, carriers, binders, UV blockers, adhesives, humectants, thickeners, dispersing agents, preservatives and insect attractants. Thus the compositions of the invention may, for example, be formulated as a solid comprising the active agent and a finely divided solid carrier. Alternatively, the active agent may be contained in liquid compositions including dispersions, emulsions and suspensions thereof. Any suitable final formulation may be used, including for example, granules, powder, bait pellets (a solid composition containing the active agent and an insect attractant or food substance), microcapsules, water dispersible granules, emulsions and emulsified concentrates. Examples of adjuvants or carriers suitable for use with the present invention include water, organic solvent, inorganic solvent, talc, pyrophyllite, synthetic fine silica, attapulgus clay, kieselguhr chalk, diatomaceous earth, lime, calcium carbonate, bentonite, fuller's earth, cottonseed hulls, wheat flour, soybean flour, pumice, tripoli, wood flour, walnut shell flour, redwood flour, and lignin. The compositions may also include conventional insecticidal agents and/or may be applied in conjunction with conventional insecticidal agents.

EXAMPLES

[0151] The following examples describe the isolation and cloning of the nucleic acid sequence of SEQ ID NOs:1, 3, 5, 7, 9, and 18, and how these sequences, derivatives and fragments thereof, and gene products can be used for genetic studies to elucidate mechanisms of the p53 pathway as well as the discovery of potential pharmaceutical agents that interact with the pathway.

[0152] These Examples are provided merely as illustrative of various aspects of the invention and should not be construed to limit the invention in any way.

Example 1 Preparation of Drosophila cDNA Library

[0153] A Drosophila expressed sequence tag (EST) cDNA library was prepared as follows. Tissues from mixed stage embryos (0-20 hour), imaginal disks and adult fly heads were collected and total RNA was prepared. Mitochondrial rRNA was removed from the total RNA by hybridization with biotinylated rRNA specific oligonucleotides and the resulting RNA was selected for polyadenylated mRNA. The resulting material was then used to construct a random primed library. First strand cDNA synthesis was primed using a six nucleotide random primer. The first strand cDNA was then tailed with terminal transferase to add approximately 15 dGTP molecules. The second strand was primed using a primer which contained a Not1 site followed by a 13 nucleotide C-tail to hybridize to the G-tailed first strand cDNA. The double stranded cDNA was ligated with BstX1 adaptors and digested with Not1. The cDNA was then fractionated by size by electrophoresis on an agarose gel and the cDNA greater than 700 bp was purified. The cDNA was ligated with Not1, BstX1 digested pCDNA-sk+ vector (a derivative of pBluescript, Stratagene) and used to transform E. coli (XL1blue). The final complexity of the library was 6×10⁶ independent clones.

[0154] The cDNA library was normalized using a modification of the method described by Bonaldo et al. (Genome Research (1996) 6:791-806). Biotinylated driver was prepared from the cDNA by PCR amplification of the inserts and allowed to hybridize with single stranded plasmids of the same library. The resulting double-stranded forms were removed using streptavidin magnetic beads, the remaining single stranded plasmids were converted to double stranded molecules using Sequenase (Amersham, Arlington Heights, Ill.), and the plasmid DNA stored at −20° C. prior to transformation. Aliquots of the normalized plasmid library were used to transform E. coli (XL1blue or DH10B), plated at moderate density, and the colonies picked into a 384-well master plate containing bacterial growth media using a Qbot robot (Genetix, Christchurch, UK). The clones were allowed to grow for 24 hours at 37° C. then the master plates were frozen at −80° C. for storage. The total number of colonies picked for sequencing from the normalized library was 240,000. The master plates were used to inoculate media for growth and preparation of DNA for use as template in sequencing reactions. The reactions were primarily carried out with a primer that initiated at the 5′ end of the cDNA inserts. However, a minor percentage of the clones were also sequenced from the 3′ end. Clones were selected for 3′ end sequencing based on either further biological interest or the selection of clones that could extend assemblies of contiguous sequences (“contigs”) as discussed below. DNA sequencing was carried out using ABI377 automated sequencers and used either ABI FS, dirhodamine or BigDye chemistries (Applied Biosystems, Inc., Foster City, Calif.).
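
As a quick check on the scale of the picking operation described above, the number of 384-well master plates implied by 240,000 picked colonies can be computed directly. The Python sketch below assumes every well of every plate is filled, which the text does not state; the names are hypothetical.

```python
COLONIES_PICKED = 240_000   # from the text
WELLS_PER_PLATE = 384       # 384-well master plates, from the text


def plates_needed(colonies, wells_per_plate=WELLS_PER_PLATE):
    """Smallest number of plates that can hold `colonies`, one colony per well."""
    if colonies < 0 or wells_per_plate <= 0:
        raise ValueError("colonies must be >= 0 and wells_per_plate > 0")
    # Integer ceiling division avoids floating-point rounding.
    return -(-colonies // wells_per_plate)
```

With fully filled plates, `plates_needed(COLONIES_PICKED)` gives 625 master plates.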

[0155] Analysis of sequences was done as follows: the traces generated by the automated sequencers were base-called using the program “Phred” (Gordon, Genome Res. (1998) 8:195-202), which also assigned quality values to each base. The resulting sequences were trimmed for quality in view of the assigned scores. Vector sequences were also removed. Each sequence was compared to all other fly EST sequences using the BLAST program and a filter to identify regions of near 100% identity. Sequences with potential overlap were then assembled into contigs using the programs “Phrap”, “Phred” and “Consed” (Phil Green, University of Washington, Seattle, Wash.;

[0156] http://bozeman.mbt.washington.edu/phrap.docs/phrap.html). The resulting assemblies were then compared to existing public databases and homology to known proteins was then used to direct translation of the consensus sequence. Where no BLAST homology was available, the statistically most likely translation based on codon and hexanucleotide preference was used. The Pfam (Bateman et al., Nucleic Acids Res. (1999) 27:260-262) and Prosite (Hofmann et al., Nucleic Acids Res. (1999) 27(1):215-219) collections of protein domains were used to identify motifs in the resulting translations. The contig sequences were archived in an Oracle-based relational database (FlyTag™, Exelixis Pharmaceuticals, Inc., South San Francisco, Calif.).
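
The quality values assigned during base-calling follow the usual Phred convention, Q = −10·log10(P), where P is the estimated probability that the base call is wrong. The Python sketch below shows that conversion together with a deliberately simple end-trimming rule. The trimming rule, the threshold of 20, and all function names are assumptions for illustration; they are not the trimming procedure actually used in the analysis described above, which the text does not specify.

```python
import math


def phred_to_error_probability(q):
    """Estimated probability of a wrong base call for Phred quality q."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred quality corresponding to error probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


def trim_by_quality(sequence, qualities, min_quality=20):
    """Strip bases from both ends until the end bases meet min_quality.

    A simplified stand-in for quality trimming: only the ends are
    examined, and low-quality bases inside the read are kept.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must be the same length")
    start, end = 0, len(sequence)
    while start < end and qualities[start] < min_quality:
        start += 1
    while end > start and qualities[end - 1] < min_quality:
        end -= 1
    return sequence[start:end]
```

Under this convention a quality of 20 corresponds to an estimated error rate of 1 in 100, and a quality of 30 to 1 in 1,000.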

Example 2 Other cDNA Libraries

[0157] A Leptinotarsa (Colorado Potato Beetle) library was prepared using the Lambda ZAP cDNA cloning kit from Stratagene (Stratagene, La Jolla, Calif., cat#200450), following manufacturer's protocols. The original cDNA used to construct the library was oligo-dT primed using mRNA from mixed stage Leptinotarsa larvae.

[0158] A Tribolium library was made using pSPORT cDNA library construction system (Life Technologies, Gaithersburg, Md.), following manufacturer's protocols. The original cDNA used to construct the library was oligo-dT primed using mRNA from adult Tribolium.

Example 3 Cloning of the p53 Nucleic Acid from Drosophila (DMp53)

[0159] The TBLASTN program (Altschul et al., supra) was used to query the FlyTag™ database with a squid p53 protein sequence (GenBank gi:1244762), chosen because the squid sequence was one of only two members of the p53 family that had been identified previously from an invertebrate. The results revealed a single sequence contig, which was 960 bp in length and which exhibited highly significant homology to squid p53 (score=192, P=5×10⁻¹²). Further analysis of this sequence with the BLASTX program against GenBank protein sequences demonstrated that this contig exhibited significant homology to the entire known family of p53-like sequences in vertebrates, and that it contained coding sequences homologous to the p53 family that encompassed essentially all of the DNA-binding domain, which is the most conserved region of the p53 protein family. Inspection of this contig indicated that it was an incomplete cDNA, missing coding regions C-terminal to the presumptive DNA-binding domain as well as the 3′ untranslated region of the mRNA.

[0160] The full-length cDNA clone was produced by Rapid Amplification of cDNA ends (RACE; Frohman et al., PNAS (1988) 85:8998-9002). A RACE-ready library was generated from Clontech (Palo Alto, Calif.) Drosophila embryo poly A⁺ RNA (Cat#694-1) using Clontech's Marathon cDNA amplification kit (Cat# K1802), and following manufacturer's directions. The following primers were used on the library to retrieve full-length clones:
3′373 CCATGCTGAAGCAATAACCACCGATG SEQ ID NO: 11
3′510 GGAACACACGCAAATTAAGTGGTTGGATGG SEQ ID NO: 12
3′566 TGATTTTGACAGCGGACCACGGG SEQ ID NO: 13
3′799 GGAAGTTTCTTTTCGCCCGATACACGAG SEQ ID NO: 14
5′164 GGCACAAAGAAAGCACTGATTCCGAGG SEQ ID NO: 15
5′300 GGAATCTGATGCAGTTCAGCCAGCAATC SEQ ID NO: 16
5′932 GGATCGCATCCAAGACGAACGCC SEQ ID NO: 17
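
For reference, basic properties of the primers listed above (length and G+C fraction) can be computed directly from their sequences. The Python sketch below copies the seven sequences from the text; the dictionary and function names are hypothetical, and the primer labels are written with a plain apostrophe in place of the prime symbol. No melting temperatures are estimated here, since the appropriate formula depends on reaction conditions not covered by this sketch.

```python
# Primer sequences as listed in the text (SEQ ID NOs: 11-17).
RACE_PRIMERS = {
    "3'373": "CCATGCTGAAGCAATAACCACCGATG",
    "3'510": "GGAACACACGCAAATTAAGTGGTTGGATGG",
    "3'566": "TGATTTTGACAGCGGACCACGGG",
    "3'799": "GGAAGTTTCTTTTCGCCCGATACACGAG",
    "5'164": "GGCACAAAGAAAGCACTGATTCCGAGG",
    "5'300": "GGAATCTGATGCAGTTCAGCCAGCAATC",
    "5'932": "GGATCGCATCCAAGACGAACGCC",
}


def gc_fraction(sequence):
    """Fraction of bases in `sequence` that are G or C."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def primer_summary(primers=RACE_PRIMERS):
    """Return {name: (length, GC fraction)} for each primer."""
    return {name: (len(seq), gc_fraction(seq)) for name, seq in primers.items()}
```

Calling `primer_summary()` returns one (length, G+C fraction) pair per primer, which can be useful when comparing the primers against the high annealing and extension temperatures used in the RACE reactions.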

[0161] RACE reactions to obtain additional 5′ and 3′ sequence of the Drosophila p53 cDNA were performed as follows. Each RACE reaction contained: 40 μl of H₂O, 5 μl of 10×Advantage PCR buffer (Clontech), 1 μl of specific p53 RACE primer at 10 μM, 1 μl of AP1 primer (from Clontech Marathon kit) at 10 μM, 1 μl of cDNA, 1 μl of dNTPs at 5 mM, 1 μl of Advantage DNA polymerase (Clontech). For 5′ RACE, the reactions contained either the 3′373, 3′510, 3′566, or 3′799 primers. For 3′ RACE, the reactions contained either the 5′164 or 5′300 primers. The reaction mixtures were subjected to the following thermocycling program steps for touchdown PCR: (1) 94° C. 1 min, (2) 94° C. 0.5 min, (3) 72° C. 4 min, (4) repeat steps 2-3 four times, (5) 94° C. 0.5 min, (6) 70° C. 4 min, (7) repeat steps 5-6 four times, (8) 94° C. 0.33 min, (9) 68° C. 4 min, (10) repeat steps 8-9 24 times, (11) 68° C. 4 min, (12) remain at 4° C.
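
The reaction recipe and cycling program above can be checked arithmetically. The Python sketch below sums the listed volumes, computes final concentrations from the stated stocks by simple dilution, and expands the touchdown program into a per-cycle list of annealing/extension temperatures. All names are hypothetical. The cycle counts rest on one reading of the program, namely that "repeat steps 2-3 four times" means four passes in addition to the first (five in total); under the other reading the totals would be correspondingly lower.

```python
# Volumes in microliters, as listed in the text.
REACTION_VOLUMES_UL = {
    "H2O": 40.0,
    "10x Advantage PCR buffer": 5.0,
    "p53 RACE primer (10 uM stock)": 1.0,
    "AP1 primer (10 uM stock)": 1.0,
    "cDNA": 1.0,
    "dNTPs (5 mM stock)": 1.0,
    "Advantage DNA polymerase": 1.0,
}


def total_volume_ul(volumes=REACTION_VOLUMES_UL):
    """Total reaction volume in microliters."""
    return sum(volumes.values())


def final_concentration(stock, volume_ul, total_ul):
    """Final concentration after dilution, in the same units as `stock`."""
    return stock * volume_ul / total_ul


# (annealing/extension temperature in deg C, number of cycles), assuming
# "repeat N times" means N passes in addition to the first.
TOUCHDOWN_STAGES = [(72, 1 + 4), (70, 1 + 4), (68, 1 + 24)]


def cycle_temperatures(stages=TOUCHDOWN_STAGES):
    """Annealing/extension temperature for each successive cycle."""
    return [temp for temp, cycles in stages for _ in range(cycles)]
```

On these assumptions the reaction volume is 50 μl, each primer is present at 0.2 μM, the dNTPs at 0.1 mM, and the buffer at 1×, and the program runs 35 cycles in total.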

[0162] Products of the RACE reactions were analyzed by gel electrophoresis. Discrete DNA species of the following sizes were observed in the RACE products produced with each of the following primers: 3′373, approx. 400 bp; 3′510, approx. 550 bp; 3′566, approx. 600 bp; 3′799, approx. 850 bp; 5′164, approx. 1400 bp; 5′300, approx. 1300 bp. The RACE DNA products were cloned directly into the vector pCR2.1 using the TOPO TA cloning kit (Invitrogen Corp., Carlsbad, Calif.) following the manufacturer's directions. Colonies of transformed E. coli were picked for each construct, and plasmid DNA prepared using a QIAGEN tip 20 kit (QIAGEN, Valencia, Calif.). Sequences of the RACE cDNA inserts within each clone were determined using standard protocols for the BigDye sequencing reagents (Applied Biosystems, Inc. Foster City, Calif.) and either M13 reverse or BigT7 primers for priming from flanking vector sequences, or 5′932 or 3′373 primers (described above) for priming internally from Drosophila p53 cDNA sequences. The products were analyzed using ABI 377 DNA sequencer. Sequences were assembled into a contig using the Sequencher program (Gene Codes Corporation), and contained a single open reading frame encoding a predicted protein of 385 amino acids, which compared favorably with the known lengths of vertebrate p53 proteins, 363 to 396 amino acids (Soussi et al., Oncogene (1990) 5:945-952). Analysis of the predicted Drosophila p53 protein using the BLASTP homology searching program and the GenBank database confirmed that this protein was a member of the p53 family, since it exhibited highly significant homology to all known p53 related proteins, but no significant homology to other protein families.

Example 4 Cloning of p53 Nucleic Acid Sequences from other Insects

[0163] The PCR conditions used for cloning the p53 nucleic acid sequences comprised a denaturation step of 94° C., 5 min; followed by 35 cycles of: 94° C. 1 min, 55° C. 1 min, 72° C. 1 min; then, a final extension at 72° C. 10 min. All DNA sequencing reactions were performed using standard protocols for the BigDye sequencing reagents (Applied Biosystems, Inc.) and products were analyzed using ABI 377 DNA sequencers. Trace data obtained from the ABI 377 DNA sequencers was analyzed and assembled into contigs using the Phred-Phrap programs.

[0164] The DMp53 DNA and protein sequences were used to query sequences from Tribolium, Leptinotarsa, and Heliothis cDNA libraries using the BLAST computer program, and the results revealed several candidate cDNA clones that might encode p53 related sequences. For each candidate p53 cDNA clone, well-separated, single colonies were streaked on a plate and end-sequenced to verify the clones. Single colonies were picked and the plasmid DNA was purified using Qiagen REAL Preps (Qiagen, Inc., Valencia, Calif.). Samples were then digested with appropriate enzymes to excise insert from vector and determine size. For example, the vector pOT2,

[0165] (www.fruitfly.org/EST/pOT2vector.html) can be excised with Xho1/EcoR1; or pBluescript (Stratagene) can be excised with BssH II. Clones were then sequenced using a combination of primer walking and in vitro transposon tagging strategies.

[0166] For primer walking, primers were designed to the known DNA sequences in the clones, using the Primer-3 software (Steve Rozen, Helen J. Skaletsky (1998) Primer3. Code available at http://www-genome.wi.mit.edu/genome_software/other/primer3.html.). These primers were then used in sequencing reactions to extend the sequence until the full sequence of the insert was determined.

[0167] The GPS-1 Genome Priming System in vitro transposon kit (New England Biolabs, Inc., Beverly, Mass.) was used for transposon-based sequencing, following manufacturer's protocols. Briefly, multiple DNA templates with randomly interspersed primer-binding sites were generated. These clones were prepared by picking 24 colonies/clone into a Qiagen REAL Prep to purify DNA and sequenced by using supplied primers to perform bidirectional sequencing from both ends of transposon insertion.

[0168] Sequences were then assembled using Phred/Phrap and analyzed using Consed. Ambiguities in the sequence were resolved by resequencing several clones. This effort resulted in several contiguous nucleotide sequences. For Leptinotarsa, a contig was assembled of 2601 bases in length, encompassing an open reading frame (ORF) of 1059 nucleotides encoding a predicted protein of 353 amino acids. The ORF extends from base 121-1180 of SEQ ID NO:3. For Tribolium, a contig was assembled of 1292 bases in length, encompassing an ORF of 1050 nucleotides, extending from base 95-1145 of SEQ ID NO:5, and encoding a predicted protein of 350 amino acids. The analysis of another candidate Tribolium p53 clone also generated a second contig of 509 bases in length, encompassing a partial ORF of 509 nucleotides (SEQ ID NO: 7), and encoding a partial protein of 170 amino acids. For Heliothis, a contig was assembled of 434 bases in length, encompassing a partial ORF of 434 nucleotides (SEQ ID NO:9), and encoding a partial protein of 145 amino acids.

Example 5 Northern Blot Analysis of DMp53

[0169] Northern blot analysis using standard methods was performed using three different poly(A)+ mRNA preparations, 0-12 h embryo, 12-24 h embryo, and adult, which were fractionated on an agarose gel along with size standards and blotted to a nylon membrane. A DNA fragment containing the entire Drosophila p53 coding region was excised by HincII digestion, separated by electrophoresis in an agarose gel, extracted from the gel, and ³²P-labeled by random-priming using the Rediprime labeling system (Amersham, Piscataway, N.J.). Hybridization of the labeled probe to the mRNA blot was performed overnight. The blot was washed at high stringency (0.2×SSC/0.1% SDS at 65° C.) and mRNA species that specifically hybridized to the probe were detected by autoradiography using X-ray film. The results showed a single cross-hybridizing mRNA species of approximately 1.6 kilobases in all three mRNA sources. This data was consistent with the observed sizes of the 5′ and 3′ RACE products described above.

Example 6 Cytogenetic Mapping of the DMp53 Gene

[0170] It was of interest to identify the map location of the DMp53 gene in order to determine whether any existing Drosophila mutants correspond to mutations in the DMp53 gene, as well as for engineering new mutations within this gene. The cytogenetic location of the DMp53 gene was determined by in situ hybridization to polytene chromosomes (Pardue, Meth Cell Biol (1994) 44:333-351) following the protocol outlined below (steps A-C).

[0171] (A) Preparation of polytene chromosome squashes: Dissected salivary glands were placed into a drop of 45% acetic acid. Glands were transferred to a drop of a 1:2:3 mixture of lactic acid:water:acetic acid. Glands were then squashed between a coverslip and a slide and incubated at 4° C. overnight. Squashes were frozen in liquid N₂ and the coverslip removed. Slides were then immediately immersed in 70% ethanol for 10 min. and then air dried. Slides were then heat treated for 30 min. at 68° C. in 2×SSC buffer. Squashes were then dehydrated by treatment with 70% ethanol for 10 min. followed by 95% ethanol for 5 min.

[0172] (B) Preparation of a biotinylated hybridization probe: a solution was prepared by mixing: 50 μl of 1 M Tris-HCl pH 7.5, 6.35 μl of 1 M MgCl₂, 0.85 μl of beta-mercaptoethanol, 0.625 μl of 100 mM dATP, 0.625 μl of 100 mM dCTP, 0.625 μl of 100 mM dGTP, 125 μl of 2 M HEPES pH 6.6, and 75 μl of 10 mg/ml pd(N)₆ (Pharmacia, Kalamazoo, Mich.). 10 μl of this solution was then mixed with 2 μl of 10 mg/ml bovine serum albumin, 33 μl containing 0.5 μg of DMp53 cDNA fragment denatured by quick boiling, 5 μl of 1 mM biotin-16-dUTP (Boehringer Mannheim, Indianapolis, Ind.), and 1 μl of Klenow DNA polymerase (2 U) (Boehringer Mannheim). The mixture was incubated at room temperature overnight and the following components were then added: 1 μl of 1 mg/ml sonicated denatured salmon sperm DNA, 5.5 μl of 3 M sodium acetate pH 5.2, and 150 μl of ethanol (100%). After mixing, the solution was stored at −70° C. for 1-2 hr. The DNA precipitate was collected by centrifugation in a microcentrifuge and the pellet was washed once in 70% ethanol, dried in a vacuum, dissolved in 50 μl TE buffer, and stored at −20° C.

[0173] (C) Hybridization and staining were performed as follows: 20 μl of the probe added to a hybridization solution (112.5 μl formamide; 25 μl 20×SSC, pH 7.0; 50 μl 50% dextran sulfate; 62.5 μl distilled H₂O) was placed on the squash. A coverslip (22 mm²) was placed on the squash, sealed with rubber cement, and placed in an airtight moist chamber overnight at 42° C. The rubber cement was removed by peeling, and the coverslip was then removed in 2×SSC buffer at 37° C. Slides were washed twice for 15 min each in 2×SSC buffer at 37° C. Slides were then washed twice for 15 min each in PBS buffer at room temperature. An “Elite” solution was prepared by mixing 1 ml of PBT buffer (PBS buffer with 0.1% Tween 20), 10 μl of Vectastain A (Vector Laboratories, Burlingame, Calif.), and 10 μl of Vectastain B (Vector Laboratories). The mixture was then allowed to incubate for 30 min. 50 μl of the Elite solution was added to the slide and then drained off. 75 μl of the Elite solution was added to the slide and a coverslip was placed onto the slide. The slide was incubated in a moist chamber for 1.5-2 hr at 37° C. The coverslip was then removed in PBS buffer, and the slide was washed twice for 10 min each in PBS buffer.
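The composition of the hybridization solution in step (C) can be expressed as final concentrations with a short calculation. This is an illustrative sketch, not part of the protocol: it assumes the formamide stock is undiluted (100%), and, because the text does not say whether the 20 μl of probe is counted in the final volume, the probe volume is left as an optional parameter. The names HYB_COMPONENTS and final_concentrations are hypothetical.

```python
# (volume in microliters, stock strength, unit) for each component of step (C).
HYB_COMPONENTS = {
    "formamide": (112.5, 100.0, "%"),       # assumption: undiluted stock
    "SSC": (25.0, 20.0, "x"),
    "dextran sulfate": (50.0, 50.0, "%"),
    "distilled H2O": (62.5, 0.0, ""),
}


def final_concentrations(components, extra_volume_ul=0.0):
    """Return (total volume, {component: (final strength, unit)}).

    extra_volume_ul lets the caller include the probe volume, since the
    text does not state whether it is part of the final volume.
    """
    total_ul = sum(vol for vol, _, _ in components.values()) + extra_volume_ul
    finals = {
        name: (stock * vol / total_ul, unit)
        for name, (vol, stock, unit) in components.items()
        if stock > 0
    }
    return total_ul, finals


if __name__ == "__main__":
    total, finals = final_concentrations(HYB_COMPONENTS)
    print(f"total volume: {total} ul")
    for name, (strength, unit) in finals.items():
        print(f"{name}: {strength:g}{unit}")
```

Without the probe volume, the four listed components total 250 μl, which corresponds to 45% formamide, 2×SSC, and 10% dextran sulfate; passing extra_volume_ul=20.0 shows the slightly lower values that result if the probe is counted.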

[0174] A fresh solution of DAB (diaminobenzidine) in PBT buffer was made by mixing 1 μl of 0.3% hydrogen peroxide with 40 μl of 0.5 mg/ml DAB solution. 40 μl of the DAB/peroxide solution was then placed onto each slide. A coverslip was placed onto the slide, which was then incubated for 2 min. Slides were then examined under a phase microscope and the reaction was stopped in PBS buffer when the signal was determined to be satisfactory. Slides were then rinsed in running H₂O for 10 min. and air dried. Finally, slides were inspected under a compound microscope to assign a chromosomal location to the hybridization signal. A single clear region of hybridization was observed on the polytene chromosome squashes, which was assigned to cytogenetic bands 94D2-6.

Example 7 Isolation and Sequence Analysis of a Genomic Clone for the DMp53 Gene

[0175] PCR was used to generate DNA probes for identification of genomic clones containing the DMp53 gene. Each reaction (50 μl total volume) contained 100 ng Drosophila genomic DNA, 2.5 μM each dNTP, 1.5 mM MgCl₂, 2 μM of each primer, and 1 μl of TAKARA exTaq DNA polymerase (PanVera Corp., Madison, Wis.). Reactions were set up with primer pair 5′164 & 3′510 (described above), and thermocycling conditions used were as follows (where 0:00 indicates time in minutes:seconds): initial denaturation of 94° C., 2:00; followed by 10 cycles of 94° C., 0:30, 58° C., 0:30, 68° C., 4:00; followed by 20 cycles of 94° C., 0:30, 55° C., 0:30, 68° C., 4:00+0:20 per cycle. PCR products were then fractionated by agarose gel electrophoresis, ³²P-labeled by nick translation, and hybridized to nylon membranes containing high-density arrayed P1 clones from the Berkeley Drosophila Genome Project (University of California, Berkeley, and purchased from Genome Systems, Inc., St. Louis, Mo.). Four positive P1 clones were identified: DS01201, DS02942, DS05102, and DS06254, and each clone was verified using a PCR assay with the primer pair described above. To prepare DNA for sequencing, E. coli containing each P1 clone was streaked to single colonies on LB agar plates containing 25 μg/ml kanamycin, and grown overnight at 37° C. Well-separated colonies for each P1 clone were picked and used to inoculate 250 ml LB medium containing 25 μg/ml kanamycin and cultures were grown for 16 hours at 37° C. with shaking. Bacterial cells were collected by centrifugation, and DNA was purified with a Qiagen Maxi-Prep System kit (QIAGEN, Inc., Valencia, Calif.). Genomic DNA sequence from the P1 clones was obtained using a strategy that combined shotgun and directed sequencing of a small insert plasmid DNA library derived from the P1 clone DNAs (Ruddy et al. Genome Research (1997) 7:441-456). All DNA sequencing and analysis were performed as described before, and P1 sequence contigs were analyzed using the BLAST sequence homology searching programs to identify those that contained the DMp53 gene or other coding regions. This analysis demonstrated that the DMp53 gene was divided into 8 exons and 7 introns. In addition, the BLAST analysis indicated the presence of two additional genes that flank the DMp53 gene; one exhibited homology to a human gene implicated in nephropathic cystinosis (labeled CTNS-like gene) and the second gene exhibited homology to a large family of oxidoreductases. Thus, we could operationally define the limits of the DMp53 gene as an 8,805 bp region corresponding to the DNA lying between the putative CTNS-like and oxidoreductase-like genes.
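The thermocycling program given for the probe PCR in this example can be tallied to estimate the programmed hold time. The sketch below is illustrative only. It assumes that "4:00+0:20 per cycle" means the 68° C. extension is lengthened by 20 seconds in each successive cycle of the second stage, and, because the text does not say whether the first of those 20 cycles already carries an increment, both readings are supported. Ramp times between temperatures are ignored. The names mmss and total_hold_seconds are hypothetical.

```python
def mmss(text: str) -> int:
    """Convert a 'minutes:seconds' string, as used in the text, to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


INITIAL_DENATURATION = mmss("2:00")
STAGE1_CYCLES = 10
STAGE1_STEPS = [mmss("0:30"), mmss("0:30"), mmss("4:00")]  # 94, 58, 68 deg C
STAGE2_CYCLES = 20
STAGE2_STEPS = [mmss("0:30"), mmss("0:30"), mmss("4:00")]  # 94, 55, 68 deg C
EXTENSION_INCREMENT = mmss("0:20")  # assumed: added per successive stage-2 cycle


def total_hold_seconds(first_cycle_extended: bool = False) -> int:
    """Sum the programmed hold times, excluding temperature ramps."""
    total = INITIAL_DENATURATION + STAGE1_CYCLES * sum(STAGE1_STEPS)
    for cycle in range(STAGE2_CYCLES):
        increments = cycle + 1 if first_cycle_extended else cycle
        total += sum(STAGE2_STEPS) + EXTENSION_INCREMENT * increments
    return total


if __name__ == "__main__":
    for flag in (False, True):
        seconds = total_hold_seconds(first_cycle_extended=flag)
        print(f"first cycle extended={flag}: {seconds} s ({seconds / 3600:.2f} h)")
```

Under either reading the programmed holds come to roughly three and a half hours, most of it in the long 68° C. extensions.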

Example 8 Analysis of P53 Nucleic Acid Sequences

[0176] Upon completion of cloning, the sequences were analyzed using the Pfam and Prosite programs, and by visual analysis and comparison with other p53 sequences. Regions of cDNA encoding the various domains of SEQ ID NOs 1-6 are depicted in Table I above. Additionally, Pfam predicted p53 similarity regions for the partial TRIB-Bp53 at amino acid residues 118-165 (SEQ ID NO:8) encoded by nucleotides 354-495 (SEQ ID NO:7), and for the partial HELIOp53 at amino acid residues 105-138 (SEQ ID NO:10) encoded by nucleotides 315-414 (SEQ ID NO:9).

[0177] Nucleotide and amino acid sequences for each of the p53 nucleic acid sequences and their encoded proteins were searched against all available nucleotide and amino acid sequences in the public databases, using BLAST (Altschul et al., supra). Tables 2-6 below summarize the results. The 5 most similar sequences are listed for each p53 gene.

TABLE 2 DMp53
GI#    DESCRIPTION
DNA BLAST of SEQ ID NO: 1
6664917 = C019980    Drosophila melanogaster, ***SEQUENCING IN PROGRESS***, in ordered pieces
5670489 = AC008200    Drosophila melanogaster chromosome 3 clone BACR17P04 (D757) RPCI-98 17.P.4 map 94D-94E strain y; cn bw sp, ***SEQUENCING IN PROGRESS***, 70 unordered pieces.
4419483 = AI516383    Drosophila melanogaster cDNA clone LD42237 5prime, mRNA sequence
4420516 = AI517416    Drosophila melanogaster cDNA clone GH28349 5prime, mRNA sequence
4419333 = AI516233    Drosophila melanogaster cDNA clone LD42031 5prime, mRNA sequence
PROTEIN BLAST of SEQ ID NO: 2
1244764 = AA98564    p53 tumor suppressor homolog [Loligo forbesi]
1244762 = AA98563    p53 tumor suppressor homolog [Loligo forbesi]
2828704 = AC31133    tumor protein p53 [Xiphophorus helleri]
2828706 = AC31134    tumor protein p53 [Xiphophorus maculatus]
3695098 = AC62643    DN p63 beta [Mus musculus]

[0178] TABLE 3 CPBp53
GI#    DESCRIPTION
DNA BLAST of SEQ ID NO: 3
6468070 = AC008132    Homo sapiens, complete sequence Chromosome 22q11 PAC Clone pac995o6 In CES-DGCR Region
4493931 = AL034556    Plasmodium falciparum MAL3P5, complete sequence
3738114 = AC004617    Homo sapiens chromosome Y, clone 264, M, 20, complete sequence
4150930 = AC005083    Homo sapiens BAC clone CTA-281G5 from 7p15-p21, complete sequence
4006838 = AC006079    Homo sapiens chromosome 17, clone hRPK.855_D_21, complete sequence
PROTEIN BLAST of SEQ ID NO: 4
1244764 = AA98564    p53 tumor suppressor homolog [Loligo forbesi]
1244762 = AA98563    p53 tumor suppressor homolog [Loligo forbesi]
4530686 = AA03817    unnamed protein product [unidentified]
4803651 = CAA72225    P73 splice variant [Cercopithecus aethiops]
2370177 = CAA72219    first splice variant [Homo sapiens]

[0179] TABLE 4 TRIB-Ap53
GI#    DESCRIPTION
DNA BLAST of SEQ ID NO: 5
5877734 = AW024204    wv01h01.x1 NCI_CGAP_Kid3 Homo sapiens cDNA clone IMAGE: 2528305 3′, mRNA sequence
16555 = X65053    A. thaliana mRNA for eukaryotic translation initiation factor 4A-2
6072079 = AW101398    sd79d06.y1 Gm-c1009 Glycine max cDNA clone GENOME SYSTEMS CLONE ID: Gm-c1009-612 5′, mRNA sequence
6070492 = AW099879    sd17g11.y2 Gm-c1012 Glycine max cDNA clone GENOME SYSTEMS CLONE ID: Gm-c1012-2013 5′, mRNA sequence
4105775 = AF049919    Petunia x hybrida PGP35 (PGP35) mRNA, complete cds.
PROTEIN BLAST of SEQ ID NO: 6
1244764 = AAA98564    p53 tumor suppressor homolog [Loligo forbesi]
3273745 = AAC24830    p53 homolog [Homo sapiens]
1244762 = AAA98563    p53 tumor suppressor homolog [Loligo forbesi]
3695096 = AAC62642    N p63 gamma [Mus musculus]
3695080 = AAC62634    DN p63 gamma [Homo sapiens]

[0180] TABLE 5 TRIB-Bp53
GI#    DESCRIPTION
DNA BLAST of SEQ ID NO: 7
4689085 = AF043641    Barbus barbus p73 mRNA, complete cds
4530689 = A64588    Sequence 7 from Patent WO9728186
N/A    No further homologies
PROTEIN BLAST of SEQ ID NO: 8
4689086 = AAD27752    p73 [Barbus barbus]
4530686 = CAA03817    unnamed protein product [unidentified]
4803651 = CAA72225    P73 splice variant [Cercopithecus aethiops]
4530690 = CAA03819    unnamed protein product [unidentified]
4530684 = CAA03816    unnamed protein product [unidentified]

[0181] TABLE 6 HELIOp53
GI#    DESCRIPTION
DNA BLAST of SEQ ID NO: 9
N/A    No homologies found
PROTEIN BLAST of SEQ ID NO: 10
2781308 = 1YCSA    Chain A, p53-53bp2 Complex
1310770 = 1TSRA    Chain A, p53 Core Domain In Complex With Dna
1310771 = 1TSRB    Chain B, p53 Core Domain In Complex With Dna
1310772 = 1TSRC    Chain C, p53 Core Domain In Complex With Dna
1310960 = 1TUPA    Chain A, Tumor Suppressor p53 Complexed With Dna

[0182] BLAST analysis of each of the p53 amino acid sequences, performed to find the shortest stretch of contiguous amino acids that is novel with respect to published sequences, indicates the following: 7 amino acid residues for DMp53 and for TRIB-Ap53, 6 amino acid residues for CPBp53, and 5 amino acid residues for TRIB-Bp53 and HELIOp53.

[0183] BLAST analysis of each of the p53 amino acid sequences, performed to find the shortest stretch of contiguous amino acids for which no sequence in the public databases shares 100% sequence similarity, indicates the following: 9 amino acid residues for DMp53, CPBp53, TRIB-Ap53, and TRIB-Bp53, and 6 amino acid residues for HELIOp53.
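The idea of a shortest contiguous stretch with no identical match in a reference set can be illustrated with an exact-match search over subsequences of increasing length. This sketch is not the BLAST procedure used above: it is a simplified stand-in that considers only 100% identical matches, and the short sequences in the usage example are invented toy strings, not the p53 sequences or any database entries. The function names kmers and shortest_novel_stretch are hypothetical.

```python
def kmers(sequence: str, k: int) -> set:
    """All contiguous substrings of length k in sequence (empty if k is too large)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shortest_novel_stretch(query: str, references: list):
    """Return (k, stretch) for the shortest stretch of the query that does not
    occur identically in any reference sequence, or None if every stretch,
    including the full query, occurs in some reference.
    """
    for k in range(1, len(query) + 1):
        known = set()
        for reference in references:
            known |= kmers(reference, k)
        for i in range(len(query) - k + 1):
            stretch = query[i:i + k]
            if stretch not in known:
                return k, stretch
    return None


if __name__ == "__main__":
    # Toy example with invented sequences.
    print(shortest_novel_stretch("MKTA", ["MKT", "KTA"]))
```

In the toy example every stretch of one to three residues of the query occurs in one of the two references, so the shortest novel stretch is the full four-residue query.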

Example 9 Drosophila Genetics

[0184] Fly culture and crosses were performed according to standard procedures at 22-25° C. (Ashburner, supra). The gl-DMp53 overexpression constructs were made by cloning a BclI-HincII fragment spanning the DMp53 open reading frame into a vector (pExPress) containing glass multiple repeats upstream of a minimal heat shock promoter. The pExPress vector is an adapted version of the pGMR vector (Hay et al., Development (1994) 120:2121-2129) which contains an alpha tubulin 3′ UTR for increased protein stabilization and a modified multiple cloning site. Standard P-element mediated germ line transformation was used to generate transgenic lines containing these constructs (Rubin and Spradling, supra). For X-irradiation experiments, third instar larvae in vials were exposed to 4,000 Rads of X-rays using a Faxitron X-ray cabinet system (Wheeling, Ill.).

Example 10 Whole-mount RNA in situ Hybridization, TUNEL, and Immunocytochemistry

[0185] In situ hybridization was performed using standard methods (Tautz and Pfeifle, Chromosoma (1989) 98:81-85). DMp53 anti-sense RNA probe was generated by digesting DMp53 cDNA with EcoRI and transcribing with T7 RNA polymerase. For immunocytochemistry, third instar larval eye and wing discs were dissected in PBS, fixed in 2% formaldehyde for 30 minutes at room temperature, permeabilized in PBS+0.5% Triton for 15 minutes at room temperature, blocked in PBS+5% goat serum, and incubated with primary antibody for two hours at room temperature or overnight at 4° C. Anti-phospho-histone staining used Anti-phospho-histone H3 Mitosis Marker (Upstate Biotechnology, Lake Placid, N.Y.) at a 1:500 dilution. Anti-DMp53 monoclonal antibody staining used hybridoma supernatant diluted 1:2. Goat anti-mouse or anti-rabbit secondary antibodies conjugated to FITC or Texas Red (Jackson Immunoresearch, West Grove, Pa.) were used at a 1:200 dilution. Antibodies were diluted in PBS+5% goat serum. TUNEL assay was performed by using the Apoptag Direct kit (Oncor, Gaithersburg, Md.) per manufacturer's protocol with a 0.5% Triton/PBS permeabilization step. Discs were mounted in anti-fade reagent (Molecular Probes, Eugene, Oreg.) and images were obtained on a Leica confocal microscope. BrdU staining was performed as described (de Nooij et al., Cell (1996) 87(7):1237-1247) and images were obtained on an Axioplan microscope (Zeiss, Thornwood, N.Y.).

Example 11 Generation of anti-DMp53 Antibodies

[0186] Anti-DMp53 rabbit polyclonal (Josman Labs, Napa, Calif.) and mouse monoclonal antibodies (Antibody Solutions Inc., Palo Alto, Calif.) were generated by standard methods using a full-length DMp53 protein fused to glutathione-S-transferase (GST-DMp53) as antigen. Inclusion bodies of GST-DMp53 were purified by centrifugation using B-PER buffer (Pierce, Rockford, Ill.) and injected subcutaneously into rabbits and mice for immunization. The final boost for mouse monoclonal antibody production used intravenous injection of soluble GST-DMp53 produced by solubilization of GST-DMp53 in 6M GuHCl and dialysis into phosphate buffer containing 1M NaCl. Hybridoma supernatants were screened by ELISA using a soluble 6×HIS-tagged DMp53 protein bound to Ni-NTA coated plates (Qiagen, Valencia, Calif.) and an anti-mouse IgG Fc-fragment specific secondary antibody.

Example 12 Functional Analysis

[0187] The goal of this series of experiments was to compare and contrast the functions of the insect p53s to those of the human p53. The DMp53 was chosen to carry out this set of experiments, although any of the other insect p53s could be used as well.

[0188] p53 Involvement in the Cell Death Pathway

[0189] To determine whether DMp53 can serve the same functions in vivo as human p53, DMp53 was ectopically expressed in Drosophila larval eye discs using glass-responsive enhancer elements. The glass-DMp53 (gl-DMp53) transgene expresses DMp53 in all cells posterior to the morphogenetic furrow. During eye development, the morphogenetic furrow sweeps from the posterior to the anterior of the eye disc. Thus, gl-DMp53 larvae express DMp53 in a field of cells which expands from the posterior to the anterior of the eye disc during larval development.

[0190] Adult flies carrying the gl-DMp53 transgene were viable but had small, rough eyes with fused ommatidia (any of the numerous elements of the compound eye). TUNEL staining of gl-DMp53 eye discs showed that this phenotype was due, at least in part, to widespread apoptosis in cells expressing DMp53. Results were confirmed by the detection of apoptotic cells with acridine orange and Nile Blue. TUNEL-positive cells appeared within 15-25 cell diameters of the furrow. Given that the furrow moves approximately 10 cell diameters per hour, this indicated that the cells became apoptotic 2-3 hours after DMp53 was expressed. Surprisingly, co-expression of the baculovirus cell death inhibitor p35 did not block the cell death induced by DMp53 (Miller, J Cell Physiol (1997) 173(2):178-182; Ohtsubo et al., Nippon Rinsho (1996) 54(7):1907-1911). However, DMp53-induced apoptosis and the rough-eye phenotype in gl-DMp53 flies could be suppressed by co-expression of the human cyclin-dependent-kinase inhibitor p21. Because p21 overexpression blocks cells in the G1 phase of the cell cycle, this finding suggests that transit through the cell cycle sensitizes cells to DMp53-induced apoptosis. A similar effect of p21 overexpression on human p53-induced apoptosis has been described.

[0191] p53 Involvement in the Cell Cycle

[0192] In addition to its ability to affect cell death pathways, mammalian p53 can induce cell cycle arrest at the G1 and G2/M checkpoints. In the Drosophila eye disc, the second mitotic wave is a synchronous, final wave of cell division posterior to the morphogenetic furrow. This unique aspect of development provides a means to assay for similar effects of DMp53 on the cell cycle. The transition of cells from G1 to S phase can be detected by BrdU incorporation. Eye discs dissected from wild-type third instar larvae displayed a tight band of BrdU-staining cells corresponding to DNA replication in the cells of the second mitotic wave. This transition from G1 to S phase was unaffected by DMp53 overexpression from the gl-DMp53 transgene. In contrast, expression of human p21 or a Drosophila homologue, dacapo (de Nooij et al., Cell (1996) 87(7):1237-1247; Lane et al., Cell (1996) 87(7):1225-1235), under control of glass-responsive enhancer elements completely blocked DNA replication in the second mitotic wave. In mammalian cells, p53 induces a cell cycle block in G1 through transcriptional activation of the p21 gene. These results suggest that this function is not conserved in DMp53.

[0193] In wild-type eye discs, the second mitotic wave typically forms a distinct band of cells that stain with an anti-phospho-histone antibody. In gl-DMp53 larval eye discs, this band of cells was significantly broader and more diffuse, suggesting that DMp53 alters the entry into and/or duration of M phase.

[0194] p53 Response to DNA Damage

[0195] The following experiments were performed to determine whether loss of DMp53 function affected apoptosis or cell cycle arrest in response to DNA damage.

[0196] In order to examine the phenotype of tissues deficient in DMp53 function, dominant-negative alleles of DMp53 were generated. These mutations are analogous to the R175H (R155H in DMp53) and H179N (H159N in DMp53) mutations in human p53. These mutations in human p53 act as dominant-negative alleles, presumably because they cannot bind DNA but retain a functional tetramerization domain. Co-expression of DMp53 R155H with wild-type DMp53 suppressed the rough eye phenotype that normally results from wild-type DMp53 overexpression, confirming that this mutant acts as a dominant-negative allele in vivo. Unlike wild-type DMp53, overexpression of DMp53 R155H or H159N using the glass enhancer did not produce a visible phenotype, although subtle alterations in the bristles of the eye were revealed by scanning electron microscopy.

[0197] In mammalian systems, p53-induced apoptosis plays a crucial role in preventing the propagation of damaged DNA. DNA damage also leads to apoptosis in Drosophila. To determine if this response requires the action of DMp53, dominant-negative DMp53 was expressed in the posterior compartment of the wing disc. Following X-irradiation, wing discs were dissected. TUNEL staining revealed apoptotic cells and anti-DMp53 antibody revealed the expression pattern of dominant-negative DMp53. Four hours after X-irradiation, wild-type third instar larval wing discs showed widespread apoptosis. When the dominant-negative allele of DMp53 was expressed in the posterior compartment of the wing disc, apoptosis was blocked in the cells expressing DMp53. Thus, induction of apoptosis following X-irradiation requires the function of DMp53. This pro-apoptotic role for DMp53 appears to be limited to a specific response to cellular damage, because developmentally programmed cell death in the eye and other tissues is unaffected by expression of either dominant-negative DMp53 allele. The requirement for DMp53 in the apoptotic response to X-irradiation suggests that DMp53 may be activated by DNA damage. In mammals, p53 is activated primarily by stabilization of p53 protein.

[0198] Although DMp53 function is required for X-ray induced apoptosis, it does not appear to be necessary for the cell cycle arrest induced by the same dose of irradiation. In the absence of irradiation, a random pattern of mitosis was observed in 3rd instar wing discs of Drosophila. Upon irradiation, a cell cycle block occurred in wild-type discs as evidenced by a significant decrease in anti-phospho-histone staining. The cell cycle block was unaffected by expression of dominant-negative DMp53 in the posterior of the wing disc. Several time points after X-irradiation were examined and all gave similar results, suggesting that both the onset and maintenance of the X-ray induced cell cycle arrest are independent of DMp53.

[0199] p53 in Normal Development

[0200] Similar to p53 in mice, DMp53 does not appear to be required for development because widespread expression of dominant-negative DMp53 in Drosophila had no significant effects on appearance, viability, or fertility. Interestingly, in situ hybridization of developing embryos revealed widespread early embryonic expression that became restricted to primordial germ cells in later embryonic stages. This expression pattern may indicate a crucial role for DMp53 in protecting the germ line, similar to the proposed role of mammalian p53 in protection against teratogens.

Example 13 P53 RNAi Experiments in Cell Culture

[0201] Stable Drosophila S2 cell lines expressing hemagglutinin epitope (HA) tagged p53, or vector control, under the inducible metallothionein promoter were produced by transfection using pMT/V5-His (Invitrogen, Carlsbad, Calif.). Induction of DMp53 expression by addition of copper to the medium resulted in cell death via apoptosis. Apoptosis was measured by three different methods: a cell proliferation assay; FACS analysis of the cell population in which dead cells were detected by their contracted nuclei; and a DNA ladder assay. The ability to use RNAi in S2 cell lines allowed p53 regulation and function to be explored using this inducible cell-based p53 expression system.

[0202] Preparation of the dsRNA template: PCR primers containing an upstream T7 RNA polymerase binding site and downstream DMp53 gene sequences were designed such that sequences extending from nucleotides 128 to 1138 of the DMp53 cDNA sequence (SEQ ID NO:1) could be amplified in a manner that would allow the generation of a DMp53-derived dsRNA. PCR reactions were performed using EXPAND High Fidelity (Boehringer Mannheim, Indianapolis, Ind.) and the products were then purified.

[0203] DMp53 RNA was generated from the PCR template using the Promega Large Scale RNA Production System (Madison, Wis.) following manufacturer's protocols. Ethanol precipitation of RNA was performed and the RNA was annealed by a first incubation at 68° C. for 10 min, followed by a second incubation at 37° C. for 30 min. The resulting dsRNA was stored at −80° C.

[0204] RNAi experiment in tissue culture: RNAi was performed essentially as described previously (http://dixonlab.biochem.med.umich.edu/protocols/RNAiExperiments.html). On day 1, cultures of Drosophila S2 cells were obtained that expressed pMT-HA-DMp53 expression plasmid and either 15 μg of DMp53 dsRNA or no RNA was added to the medium. On the second day, CuSO₄ was added to final concentrations of either 0, 7, 70 or 700 μM to all cultures. On the fourth day, an alamarBlue (Alamar Biosciences Inc., Sacramento, Calif.) staining assay was performed to measure the number of live cells in each culture, by measuring fluorescence at 590 nm.

[0205] At 7 μM CuSO₄, there was no change in cell number from 0 μM CuSO₄ for RNAi-treated or untreated cells. At 70 μM CuSO₄, there was no change in cell number from 0 μM CuSO₄ for the RNAi-treated category. However, the number of cells that were not treated with RNAi dropped by 30%. At 700 μM CuSO₄, the number of cells that were treated with RNAi dropped by 30% (as compared with 0 μM CuSO₄), while the number of cells that were not treated with RNAi dropped by 70%.
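The cell-number changes reported in the preceding paragraph can be tabulated as the percentage of cells remaining relative to the 0 μM CuSO₄ control. The sketch below only restates those reported figures; "no change" is encoded as a 0% drop, and no additional measurements are implied. The names REPORTED_DROP_PERCENT and summarize are hypothetical.

```python
# Reported drop in cell number (%) relative to 0 uM CuSO4, keyed by CuSO4
# concentration in uM. "No change" in the text is encoded as 0.
REPORTED_DROP_PERCENT = {
    7: {"dsRNA": 0, "no dsRNA": 0},
    70: {"dsRNA": 0, "no dsRNA": 30},
    700: {"dsRNA": 30, "no dsRNA": 70},
}


def summarize(reported=REPORTED_DROP_PERCENT):
    """Return rows of (CuSO4 uM, % remaining with dsRNA, % remaining without,
    difference in percentage points)."""
    rows = []
    for concentration in sorted(reported):
        with_dsrna = 100 - reported[concentration]["dsRNA"]
        without_dsrna = 100 - reported[concentration]["no dsRNA"]
        rows.append((concentration, with_dsrna, without_dsrna,
                     with_dsrna - without_dsrna))
    return rows


if __name__ == "__main__":
    print("CuSO4 (uM)  dsRNA (%)  no dsRNA (%)  difference (points)")
    for row in summarize():
        print("{:>10}  {:>9}  {:>12}  {:>19}".format(*row))
```

Laid out this way, the dsRNA-treated cultures retain 100%, 100%, and 70% of their cells at 7, 70, and 700 μM CuSO₄, against 100%, 70%, and 30% for the untreated cultures.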

[0206] These experiments showed that p53 dsRNA rescued at least 70% of the cells in the p53 inducible category, since some cell loss might be attributable to copper toxicity. Results of these experiments demonstrate that DMp53 dsRNA rescues cells from apoptosis caused by inducing DMp53 overexpression. Thus, this experimental cell-based system represents a defined and unique way to study the mechanisms of p53 function and regulation.

1 35 1 1573 DNA Drosophila melanogaster 1 aaaatccaaa tagtcggtggccactacgat tctgtagttt tttgttagcg aatttttaat 60 atttagcctc cttccccaacaagatcgctt gatcagatat agccgactaa gatgtatata 120 tcacagccaa tgtcgtggcacaaagaaagc actgattccg aggatgactc cacggaggtc 180 gatatcaagg aggatattccgaaaacggtg gaggtatcgg gatcggaatt gaccacggaa 240 cccatggcct tcttgcagggattaaactcc gggaatctga tgcagttcag ccagcaatcc 300 gtgctgcgcg aaatgatgctgcaggacatt cagatccagg cgaacacgct gcccaagcta 360 gagaatcaca acatcggtggttattgcttc agcatggttc tggatgagcc gcccaagtct 420 ctttggatgt actcgattccgctgaacaag ctctacatcc ggatgaacaa ggccttcaac 480 gtggacgttc agttcaagtctaaaatgccc atccaaccac ttaatttgcg tgtgttcctt 540 tgcttctcca atgatgtgagtgctcccgtg gtccgctgtc aaaatcacct tagcgttgag 600 cctttgacgg ccaataacgcaaaaatgcgc gagagcttgc tgcgcagcga gaatcccaac 660 agtgtatatt gtggaaatgctcagggcaag ggaatttccg agcgtttttc cgttgtagtc 720 cccctgaaca tgagccggtctgtaacccgc agtgggctca cgcgccagac cctggccttc 780 aagttcgtct gccaaaactcgtgtatcggg cgaaaagaaa cttccttagt cttctgcctg 840 gagaaagcat gcggcgatatcgtgggacag catgttatac atgttaaaat atgtacgtgc 900 cccaagcggg atcgcatccaagacgaacgc cagctcaata gcaagaagcg caagtccgtg 960 ccggaagccg ccgaagaagatgagccgtcc aaggtgcgtc ggtgcattgc tataaagacg 1020 gaggacacgg agagcaatgatagccgagac tgcgacgact ccgccgcaga gtggaacgtg 1080 tcgcggacac cggatggcgattaccgtctg gctattacgt gccccaataa ggaatggctg 1140 ctgcagagca tcgagggcatgattaaggag gcggcggctg aagtcctgcg caatcccaac 1200 caagagaatc tacgtcgccatgccaacaaa ttgctgagcc ttaagaaacg tgcctacgag 1260 ctgccatgac ttctgatctggtcgacaatc tcccaggtat cagatacctt tgaaatgtgt 1320 tgcatctgtg gggtatactacatagctatt agtatcttaa gtttgtatta gtccttgttc 1380 gtaaggcgtt taacggtgatattccccttt tggcatgttc gatggccgaa aagaaaacat 1440 ttttatattt ttgatagtatactgttgtta actgcagttc tatgtgacta cgtaactttt 1500 gtctaccaca acaaacatactctgtacaaa aaagccaaaa gtgaatttat taaagagttg 1560 tcatattttg caa 1573 2385 PRT Drosophila melanogaster 2 Met Tyr Ile Ser Gln Pro Met Ser TrpHis Lys Glu Ser Thr Asp Ser 1 5 10 15 Glu Asp Asp Ser Thr Glu 
Val AspIle Lys Glu Asp Ile Pro Lys Thr 20 25 30 Val Glu Val Ser Gly Ser Glu LeuThr Thr Glu Pro Met Ala Phe Leu 35 40 45 Gln Gly Leu Asn Ser Gly Asn LeuMet Gln Phe Ser Gln Gln Ser Val 50 55 60 Leu Arg Glu Met Met Leu Gln AspIle Gln Ile Gln Ala Asn Thr Leu 65 70 75 80 Pro Lys Leu Glu Asn His AsnIle Gly Gly Tyr Cys Phe Ser Met Val 85 90 95 Leu Asp Glu Pro Pro Lys SerLeu Trp Met Tyr Ser Ile Pro Leu Asn 100 105 110 Lys Leu Tyr Ile Arg MetAsn Lys Ala Phe Asn Val Asp Val Gln Phe 115 120 125 Lys Ser Lys Met ProIle Gln Pro Leu Asn Leu Arg Val Phe Leu Cys 130 135 140 Phe Ser Asn AspVal Ser Ala Pro Val Val Arg Cys Gln Asn His Leu 145 150 155 160 Ser ValGlu Pro Leu Thr Ala Asn Asn Ala Lys Met Arg Glu Ser Leu 165 170 175 LeuArg Ser Glu Asn Pro Asn Ser Val Tyr Cys Gly Asn Ala Gln Gly 180 185 190Lys Gly Ile Ser Glu Arg Phe Ser Val Val Val Pro Leu Asn Met Ser 195 200205 Arg Ser Val Thr Arg Ser Gly Leu Thr Arg Gln Thr Leu Ala Phe Lys 210215 220 Phe Val Cys Gln Asn Ser Cys Ile Gly Arg Lys Glu Thr Ser Leu Val225 230 235 240 Phe Cys Leu Glu Lys Ala Cys Gly Asp Ile Val Gly Gln HisVal Ile 245 250 255 His Val Lys Ile Cys Thr Cys Pro Lys Arg Asp Arg IleGln Asp Glu 260 265 270 Arg Gln Leu Asn Ser Lys Lys Arg Lys Ser Val ProGlu Ala Ala Glu 275 280 285 Glu Asp Glu Pro Ser Lys Val Arg Arg Cys IleAla Ile Lys Thr Glu 290 295 300 Asp Thr Glu Ser Asn Asp Ser Arg Asp CysAsp Asp Ser Ala Ala Glu 305 310 315 320 Trp Asn Val Ser Arg Thr Pro AspGly Asp Tyr Arg Leu Ala Ile Thr 325 330 335 Cys Pro Asn Lys Glu Trp LeuLeu Gln Ser Ile Glu Gly Met Ile Lys 340 345 350 Glu Ala Ala Ala Glu ValLeu Arg Asn Pro Asn Gln Glu Asn Leu Arg 355 360 365 Arg His Ala Asn LysLeu Leu Ser Leu Lys Lys Arg Ala Tyr Glu Leu 370 375 380 Pro 385 3 2600DNA Leptinotarsa decemlineata 3 gtgtttagtt attgttcggg ggctgtttttttaattaaaa atttcacggg taaatctttg 60 ttgtcttttc tttttctaat tgtatcagaatagctttttt aactgtgaaa accggaaggg 120 atgtcttctc agtcagactt tttacctccagatgttcaaa atttcctctt ggcagaaatg 180 gaaggggaca atatggataa tctaaactttttcaaggacg 
aaccaacttt gaatgattta 240 aattattcaa acatcctaaa tggatcaatagttgctaatg atgattcaaa gatggttcat 300 cttatttttc cgggagtaca aacaagtgtcccatcaaatg atgaatacga tggtccatat 360 gaatttgaag tagatgttca tcccactgtggcaaaaaatt cgtgggtgta ctctaccacc 420 ctgaataaag tttatatgac aatgggcagtccatttcctg tagatttcag agtatcacat 480 cgacccccga acccattatt catcaggagcactcccgttt acagtgctcc ccaatttgct 540 caagaatgtg tttaccggtg cctaaaccatgaattctctc ataaagagtc tgatggagat 600 ctcaaggaac acattcgccc tcatatcataagatgtgcca atcagtatgc tgcttactta 660 ggtgacaagt ctaaaaatga acgtctcagcgttgtcatac cattcggtat cccgcagacg 720 ggtactgaaa gtgttagaga aattttcgaatttgtttgca aaaattcttg cccaagtcct 780 ggaatgaata gaagagctgt ggaaataatattcactttgg aggataatca aggaactatc 840 tatggacgca aaacattaaa tgtgagaatatgctcttgtc caaaacgtga taaagagaaa 900 gatgaaaagg ataacactgc caacactaatctgccgcatg gcaaaaagag aaaaatggag 960 aagccatcaa agaaacccat gcagacacaggcagaaaatg ataccaaaga gtttactctg 1020 accataccgc tggtgggtcg acataatgaacaaaatgtgt tgaagtattg ccatgatttg 1080 atggccgggg aaatcctgcg aaatatcggcaatggtactg aagggccgta caaaatagct 1140 ttaaacaaaa taaacacgtt gatacgtgaaagttccgagt gaccttatca attctatgta 1200 tatttcttat acaattccat tttcatatttccatttgata ataagaaaca ttttagcacc 1260 ttttaatcct acactgcagg gaagtcaatatttctttagt tttttgcatg atattgtttg 1320 ttataacatt ttttttttca acaacaggtgacttgatttt tgtaaggtat ctcattattt 1380 atgtttaaga cctaaaacac gaaaccaaaaacatgaatgg tcattgaatt tggctcgata 1440 atcaatccaa tgttctttaa agtaatatcgacctgttcac aacttttgtg atgcactgaa 1500 tggcttttta ttattattat ttttcagcattgtacatcat acttgcatag tttcagtttt 1560 aaatttttca aatgtttcat ttattttcattcttacacct gaacttggat tttggacaca 1620 tggctttcac aatgttctat cacgaacagtatgataagcc aaagtaagag ttgataatag 1680 ttcatattaa tatctattgt aacaccgactattgttatat aaatagtcgt ttttttgtta 1740 cttttcttgc tttattttat acacttgagtcaagtgtagt cagtacattg actatgctgg 1800 aaaacctgtt ttgagtttat ttttacttacattcagttct catcattaga aattgtttat 1860 tttttgtgtg caatatttac gaaaaatggtgcaatactat aataggaaca ttaataaagt 1920 aacttgaaag catagaggtg 
gtgaattttgtttttgatca actttttgaa atttatgcgc 1980 cattctataa gccagttttt tttgataaattcaaaattca cgaataggta tcaacctgat 2040 tgcatgctta ttctatgttt gtcctaaagcaggtctctat aaaacttctc taaaagttgt 2100 gcagagcaaa taacaaataa ttttttaatggattatatca attcatgaac tggtttaatt 2160 gaaagagtag attattctat tgggttcacaaaaatataaa taatgtgtta ctatctggat 2220 catttgtttt tttttcattg agctatattttgtcattgta ttgttgaact ttccctaaat 2280 cccagtgcca tagtcgacga tcggtctcgctcccatccat caattattcg aaatctcatt 2340 tattttaaag actgaggacg gggtgggactgtcagtgtat ctgtttaatg agaaccatct 2400 tgtactagga ttgatatgtg aatctatgagtaggtgcatt tttatatata tatctttatg 2460 tttatttagt attattgtac aggttatgtactctagtgga agaatacata acctaattat 2520 tatatatgtt cgttaatata caaattttttacgtttttaa aatatatttt ctaaatattc 2580 aacaaaaaaa aaaaaaaaaa 2600 4 354PRT Leptinotarsa decemlineata 4 Met Ser Ser Gln Ser Asp Phe Leu Pro ProAsp Val Gln Asn Phe Leu 1 5 10 15 Leu Ala Glu Met Glu Gly Asp Asn MetAsp Asn Leu Asn Phe Phe Lys 20 25 30 Asp Glu Pro Thr Leu Asn Asp Leu AsnTyr Ser Asn Ile Leu Asn Gly 35 40 45 Ser Ile Val Ala Asn Asp Asp Ser LysMet Val His Leu Ile Phe Pro 50 55 60 Gly Val Gln Thr Ser Val Pro Ser AsnAsp Glu Tyr Asp Gly Pro Tyr 65 70 75 80 Glu Phe Glu Val Asp Val His ProThr Val Ala Lys Asn Ser Trp Val 85 90 95 Tyr Ser Thr Thr Leu Asn Lys ValTyr Met Thr Met Gly Ser Pro Phe 100 105 110 Pro Val Asp Phe Arg Val SerHis Arg Pro Pro Asn Pro Leu Phe Ile 115 120 125 Arg Ser Thr Pro Val TyrSer Ala Pro Gln Phe Ala Gln Glu Cys Val 130 135 140 Tyr Arg Cys Leu AsnHis Glu Phe Ser His Lys Glu Ser Asp Gly Asp 145 150 155 160 Leu Lys GluHis Ile Arg Pro His Ile Ile Arg Cys Ala Asn Gln Tyr 165 170 175 Ala AlaTyr Leu Gly Asp Lys Ser Lys Asn Glu Arg Leu Ser Val Val 180 185 190 IlePro Phe Gly Ile Pro Gln Thr Gly Thr Glu Ser Val Arg Glu Ile 195 200 205Phe Glu Phe Val Cys Lys Asn Ser Cys Pro Ser Pro Gly Met Asn Arg 210 215220 Arg Ala Val Glu Ile Ile Phe Thr Leu Glu Asp Asn Gln Gly Thr Ile 225230 235 240 Tyr Gly Arg Lys Thr Leu Asn Val Arg Ile Cys Ser Cys Pro LysArg 245 
250 255 Asp Lys Glu Lys Asp Glu Lys Asp Asn Thr Ala Asn Thr AsnLeu Pro 260 265 270 His Gly Lys Lys Arg Lys Met Glu Lys Pro Ser Lys LysPro Met Gln 275 280 285 Thr Gln Ala Glu Asn Asp Thr Lys Glu Phe Thr LeuThr Ile Pro Leu 290 295 300 Val Gly Arg His Asn Glu Gln Asn Val Leu LysTyr Cys His Asp Leu 305 310 315 320 Met Ala Gly Glu Ile Leu Arg Asn IleGly Asn Gly Thr Glu Gly Pro 325 330 335 Tyr Lys Ile Ala Leu Asn Lys IleAsn Thr Leu Ile Arg Glu Ser Ser 340 345 350 Glu Trp 5 1291 DNA Triboliumcastaneum 5 acgcgtccgg ccaacttaac ctaaaaattt gttttcgatg cctactagatttaaaaacaa 60 ttgattcaaa tcgtggattt ttattattta aatcatgagc caacaaagtcaattttcgga 120 catcattcct gatgttgata aatttttgga agatcatgga ctcaaggacgatgtgggaag 180 aataatgcac gaaaacaacg tccatttagt aaatgacgac ggagaagaagaaaaatactc 240 taatgaagcc aattacactg aatcaatttt cccccccgac cagcccacaaacctaggcac 300 tgaggaatac ccaggccctt ttaatttctc agtcctgatc agccccaacgagcaaaaatc 360 gccctgggag tattcggaaa aactgaacaa aatattcatc ggcatcaacgtgaaattccc 420 cgtggccttc tccgtgcaaa accgccccca gaacctgccc ctctacatccgcgccacccc 480 cgtgttcagc caaacgcagc acttccaaga cctggtgcac cgctgcgtcggccaccgcca 540 cccccaagac cagtccaaca aaggcgtcgc cccccacatt ttccagcacattattaggtg 600 caccaacgac aacgccctat actttggcga taaaaacaca gggacgagactcaacatcgt 660 cctgcctttg gcccaccccc aggtggggga ggacgtggtc aaggagtttttccagtttgt 720 gtgcaaaaac tcctgccctt tggggatgaa tcggcggccg attgatgtcgttttcaccct 780 ggaggataat aagggggagg ttttcgggag gaggttggtg ggggtgagggtgtgttcgtg 840 tccgaagcgt gacaaggaca aggaggagaa ggacatggag agtgctgtgcctccaaggag 900 gaagaagagg aagttgggga atgatgagcg aagggttgtg ccacaggggagctccgataa 960 taaaatattt gcgttaaata ttcatattcc tggcaagaag aattatttacaagccctcaa 1020 gatgtgtcaa gatatgctgg ctaatgaaat tttgaaaaaa caggaacaaggtggcgacga 1080 ttctgctgat aagaactgtt ataatgagat aactgttctc ttgaacggcacggccgcctt 1140 tgattagttt atttctatat ttaattttat actttgtact tatgcaatattccagtttac 1200 ttttgtaata tttttattaa taaatttcta cgttttaaaa aaaaaaaaaaaaaaaaaaaa 1260 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa a 1291 6 
350 PRTTribolium castaneum 6 Met Ser Gln Gln Ser Gln Phe Ser Asp Ile Ile ProAsp Val Asp Lys 1 5 10 15 Phe Leu Glu Asp His Gly Leu Lys Asp Asp ValGly Arg Ile Met His 20 25 30 Glu Asn Asn Val His Leu Val Asn Asp Asp GlyGlu Glu Glu Lys Tyr 35 40 45 Ser Asn Glu Ala Asn Tyr Thr Glu Ser Ile PhePro Pro Asp Gln Pro 50 55 60 Thr Asn Leu Gly Thr Glu Glu Tyr Pro Gly ProPhe Asn Phe Ser Val 65 70 75 80 Leu Ile Ser Pro Asn Glu Gln Lys Ser ProTrp Glu Tyr Ser Glu Lys 85 90 95 Leu Asn Lys Ile Phe Ile Gly Ile Asn ValLys Phe Pro Val Ala Phe 100 105 110 Ser Val Gln Asn Arg Pro Gln Asn LeuPro Leu Tyr Ile Arg Ala Thr 115 120 125 Pro Val Phe Ser Gln Thr Gln HisPhe Gln Asp Leu Val His Arg Cys 130 135 140 Val Gly His Arg His Pro GlnAsp Gln Ser Asn Lys Gly Val Ala Pro 145 150 155 160 His Ile Phe Gln HisIle Ile Arg Cys Thr Asn Asp Asn Ala Leu Tyr 165 170 175 Phe Gly Asp LysAsn Thr Gly Thr Arg Leu Asn Ile Val Leu Pro Leu 180 185 190 Ala His ProGln Val Gly Glu Asp Val Val Lys Glu Phe Phe Gln Phe 195 200 205 Val CysLys Asn Ser Cys Pro Leu Gly Met Asn Arg Arg Pro Ile Asp 210 215 220 ValVal Phe Thr Leu Glu Asp Asn Lys Gly Glu Val Phe Gly Arg Arg 225 230 235240 Leu Val Gly Val Arg Val Cys Ser Cys Pro Lys Arg Asp Lys Asp Lys 245250 255 Glu Glu Lys Asp Met Glu Ser Ala Val Pro Pro Arg Arg Lys Lys Arg260 265 270 Lys Leu Gly Asn Asp Glu Arg Arg Val Val Pro Gln Gly Ser SerAsp 275 280 285 Asn Lys Ile Phe Ala Leu Asn Ile His Ile Pro Gly Lys LysAsn Tyr 290 295 300 Leu Gln Ala Leu Lys Met Cys Gln Asp Met Leu Ala AsnGlu Ile Leu 305 310 315 320 Lys Lys Gln Glu Gln Gly Gly Asp Asp Ser AlaAsp Lys Asn Cys Tyr 325 330 335 Asn Glu Ile Thr Val Leu Leu Asn Gly ThrAla Ala Phe Asp 340 345 350 7 508 DNA Tribolium castaneum 7 gtacgacaatacaaaccgcc cgatttttcc cacactttcc acccaataat ttgctcaatt 60 ttccagttggaagacttcaa attcaacatc aaccaaagct cgtacctctc agcccccatt 120 ttcccccccagcgagccgct cgagctgtgc aacaccgagt accccggccc cctcaacttc 180 gaggtgtttgtggaccccaa cgtgctcaaa aacccctggg aatactcccc aattctcaac 240 aaaatttacatcgatatgaa 
acacaaattc ccgattaatt tcagcgtgaa gaaggccgat 300 cctgagcgcaggctttttgt cagagttatg ccgatgtttg aggaagacag atatgtgcaa 360 gaattggtgcataggtgcat ctgtcacgaa caattgacag atccgaccaa tcacaacgtt 420 tcggaaatggtggctcagca catcattcgg tgtgataaca acaatgctca gtatttcggg 480 gataagaacgctgggaagag actgagta 508 8 169 PRT Tribolium castaneum 8 Val Arg Gln TyrLys Pro Pro Asp Phe Ser His Thr Phe His Pro Ile 1 5 10 15 Ile Cys SerIle Phe Gln Leu Glu Asp Phe Lys Phe Asn Ile Asn Gln 20 25 30 Ser Ser TyrLeu Ser Ala Pro Ile Phe Pro Pro Ser Glu Pro Leu Glu 35 40 45 Leu Cys AsnThr Glu Tyr Pro Gly Pro Leu Asn Phe Glu Val Phe Val 50 55 60 Asp Pro AsnVal Leu Lys Asn Pro Trp Glu Tyr Ser Pro Ile Leu Asn 65 70 75 80 Lys IleTyr Ile Asp Met Lys His Lys Phe Pro Ile Asn Phe Ser Val 85 90 95 Lys LysAla Asp Pro Glu Arg Arg Leu Phe Val Arg Val Met Pro Met 100 105 110 PheGlu Glu Asp Arg Tyr Val Gln Glu Leu Val His Arg Cys Ile Cys 115 120 125His Glu Gln Leu Thr Asp Pro Thr Asn His Asn Val Ser Glu Met Val 130 135140 Ala Gln His Ile Ile Arg Cys Asp Asn Asn Asn Ala Gln Tyr Phe Gly 145150 155 160 Asp Lys Asn Ala Gly Lys Arg Leu Ser 165 9 433 DNA Heliothisvirescens 9 gcacgagatg aagtgcaact ttagcgtgca attcaactgg gactatcagaaggcgccgca 60 tatgttcgtg cggtctaccg tcgtgttctc cgatgaaacg caggcggagaagcgggtcga 120 acgatgtgtg cagcatttcc atgaaagctc cacttctgga atccaaacagaaattgccaa 180 aaacgtgctc cactcgtccc gggagatcgg tacccagggc gtgtactactgcgggaaggt 240 ggacatggca gactcgtggt actcagtgct ggtggagttt atgaggaccagctcggagtc 300 ctgctcccat gcgtaccagt tctcctgcaa gaactcttgt gcaaccggcattaataggcg 360 ggctattgcc attattttta cgctggaaga tgctatgggc aacatccacggccgtcagaa 420 agtaggggcg agg 433 10 144 PRT Heliothis virescens 10 HisGlu Met Lys Cys Asn Phe Ser Val Gln Phe Asn Trp Asp Tyr Gln 1 5 10 15Lys Ala Pro His Met Phe Val Arg Ser Thr Val Val Phe Ser Asp Glu 20 25 30Thr Gln Ala Glu Lys Arg Val Glu Arg Cys Val Gln His Phe His Glu 35 40 45Ser Ser Thr Ser Gly Ile Gln Thr Glu Ile Ala Lys Asn Val Leu His 50 55 60Ser Ser Arg Glu Ile Gly Thr Gln Gly Val Tyr 
Tyr Cys Gly Lys Val 65 70 7580 Asp Met Ala Asp Ser Trp Tyr Ser Val Leu Val Glu Phe Met Arg Thr 85 9095 Ser Ser Glu Ser Cys Ser His Ala Tyr Gln Phe Ser Cys Lys Asn Ser 100105 110 Cys Ala Thr Gly Ile Asn Arg Arg Ala Ile Ala Ile Ile Phe Thr Leu115 120 125 Glu Asp Ala Met Gly Asn Ile His Gly Arg Gln Lys Val Gly AlaArg 130 135 140 11 26 DNA Drosophila melanogaster 11 ccatgctgaagcaataacca ccgatg 26 12 30 DNA Drosophila melanogaster 12 ggaacacacgcaaattaagt ggttggatgg 30 13 23 DNA Drosophila melanogaster 13 tgattttgacagcggaccac ggg 23 14 28 DNA Drosophila melanogaster 14 ggaagtttcttttcgcccga tacacgag 28 15 27 DNA Drosophila melanogaster 15 ggcacaaagaaagcactgat tccgagg 27 16 28 DNA Drosophila melanogaster 16 ggaatctgatgcagttcagc cagcaatc 28 17 23 DNA Drosophila melanogaster 17 ggatcgcatccaagacgaac gcc 23 18 27425 DNA Drosophila melanogaster 18 tagccactcgctagtttata gttcaaggtg aacatacgta agagttttgt ggcactggac 60 tggaaataggctgctagtcc tttgtgttcg gccatagcgt taaaaattta agccaacgcc 120 agtcgtcctgcgcccatgtt gctgcaacat tctggcttcg tgtcatgcca ctgaatgttt 180 cacattatttaacccccttt attttttttt tttgtgtggc actggccaaa ggtccaaagg 240 ggcgacatgctgcaggggcg tggcctgcag ctgcttgcaa cgggcaatta ttgcgcagtt 300 attgcatgtcgtgtgcaatg cctatgaatt attacgtata cacagtgtgt cctcggcaat 360 aacgaaagtccgggaggggg cggggcggta ttcatgctgc agttgcccat aaattcaacg 420 aaattgctacagtttttatt tgtaatgact gggcatggta agttaatatg attcttcata 480 ctgattaagtgcttttgtta cttttttaat tattcaagta aaaatattaa tttgtgtttc 540 atgggactttttgtagtagt taccctacta ctacattaaa cattaatttc aaagaagtag 600 atatacgagtaaatgggcaa tatgaaaatt tgaaaaaggt aaagcttatg atactaacta 660 atgccaaatgaaaactagga gtatgataat aatatgaaga tagcccacca ggctatccca 720 aaatcgtcatcaaatccaat ggtgttcatt aaattaggta atcgcatgtg cccttatgtc 780 aaccatatcgccgctcaacc aagtcatttc ggtcgctgag gcaatcgaga tatggggcgc 840 caccgaccttggccaacatg ctccacattg ggctccaagt ggcaaccgca aaggtcacgc 900 acagttcgccattgcgaatc gcatactgcc aatggaaact acattgcgta tctggtggcc 960 ctttgatggcgctctaatta aaggctacct gccactaatt 
agtgatagac aatcgtcggg 1020 ggagttcgggtggcatcgtt ggcaggcact taacccaaga caggggggcc aactggcatt 1080 ggatggccgtttttgaattc gtatgtcgga agcagtcgat gcagggttgg gggggatgga 1140 aacaaatgttgtcaacgcca aaaccactga actgttaaaa gtgccattga atccaacaag 1200 gatgctgggcgcaactgtgc aacctaacaa actgtcggaa agacagcagc aacatgggca 1260 tgcatggcttgatactggga gtctgttcga tggatcccac ttgaaccgaa ccgtactgaa 1320 ccgtgccccggccagatgag gcgccccacc caacgccact cttgaaaacc ccaagccctt 1380 tgcacgcgctaaatagtttt gtttattgca cattgaaacc gagccagcga gcaattccgg 1440 tggctgctccgcgcgcgaca cactccagcg atctaatcag caatctcgac gacgaccggg 1500 ctgacatggggtttctcata cgctcggtta gacgcgacgt cgacgctcga tcgaatattt 1560 tcccaatgcactggcagaaa atgtgtggaa gtgtgagatt aagctcataa attagtagtg 1620 cacttaatgtggaaaatatt agaaacaaca gtgaacagtt gattggttct cttataaatt 1680 ttattaattattgaacattt gaagaaagat attgattaaa tcaactttgg atgtatacat 1740 atatataaaaaagtatatga tgactttcat gttgagaggt cataactttg taatgatatt 1800 ggttctagtcatcatttcgt gaaacagctg tgcaagcatt cgattatatg tggtatgtaa 1860 tttatttgggttaatatatt tttcgcagtg tactgcttct gctgcgtcac ttcacattcg 1920 tatcatttacatacgcagca ctgcggagtg agtcgctgag tacctggcgc tctggggtct 1980 ctgggatctctgggcttggg gatggatctc cactcgatga tctctccgcc tgggagccca 2040 gatcatcgtctgctatttgc aagtcgagag tcgcgcgagt cggacgtaca atcgccgcag 2100 cggaatcaagtgtgataaaa gtgaacagaa ctttagccaa gtgcatttgg ctaatggaag 2160 tggtggcaaaagtcaaagcc acacgttata ctcgaattta aaaacaaata aataatgcat 2220 aagcaggcgagtttgaagta attagcacaa cgatgatgct ggcggccaac tgacccacat 2280 cgggaaatcgctctaattca tatttgttgt cgagtgggcc aggataacag gataacagga 2340 tactgctggctcatttgcat ttgcatatat gcaaatagtt cgatctgcag gcgattgagt 2400 gaccgaaagtgttggactgt gccaaataca taaccagcta acgggcaaaa agccactgaa 2460 taaatggcccttgttactcg gttcgtgtaa tgcgtctacg agtttagccc gtgttctgac 2520 cgagaatcaattaaaattta ttgcacgagc atgccaaaca attcgcggtt gcagccacaa 2580 aaacgcatctgaaaaacaat gccaccactc caatcacttg tgaccgcccc ccggctatgc 2640 aaattagccattgcagcgat tttgctaatt ctccagctaa acgctagtgg tgagttctca 2700 
gttggctaatatatatatat gtatatatat gaaatatgaa aaatcggaaa acccctttgc 2760 aaacattgctccgcgcttag ctcatgatga tgccaattcc gagagcgttt tgaagatgca 2820 ctcgccatttgcattcaaaa gccaagcgaa taaatggaga agcaaaacca aaactgcata 2880 gatcaatttacaagtcggca aaggggttta ctcgctgcat gtgcatgtca gctgctatta 2940 tagatttatttattggcaaa caccctgaga acgagtttca ttggggggcc taagtgggag 3000 aatgacctacacaggaaagt gctcttaact aagcaactaa cttctggaaa agcggaagtg 3060 gagagattaagtactatctt atagatatgc cagaatatca aaaaagtatc taccagatac 3120 cttgaaagatctctgcatat ctcaattgca attcatgata agtttgttaa gttacgtttt 3180 ttaatttccaattcaacctt tcaattagtt aataacgcca atctcagaca ttcctaaacc 3240 ccctccctacttaagggtaa atcccgatga tgcttgattg attttctcat tgctcagcta 3300 tgcataaaaatatcatatta attgatgagc acgagcttag ctaccagaat tgaaatccat 3360 atgactgctcggcaatttga aaaatgcgtt ggttcccagt catgcgcatc ccgttggatt 3420 gaaacccacattcatggcat tccgttctgc cccccagttg cgctgctgct caagtgtccg 3480 ttgcaccagttgcagctgca gaagatcgtc ggattccggc caccgctgga gtatctgaat 3540 gcggataatcggatctacgg accggaaatg gtgagcaact tcaagactcg caacggccaa 3600 caggaacttccggtcagcca ggtgtgctgg cgcatctgca acgaggatcc cgattgcatt 3660 gcctatgtccatctgctgga cacggacgag tgccatggct actcgtactt cgagcgaacc 3720 tcgcgctatctggccatttc gggtgaactg cctctggtgg cagacggcga ggccgtcttc 3780 tacgaaaagacctgcctccg aggtgagtaa ttctccagcc aaacctccgg aagtggccgt 3840 gatccgcctctaatccattc cgaccttgca gttcccgatg cgtgccgtgg gcgtctctgg 3900 gcactgaccaaaatccccgg cagcacgctg gtctaccaca gcaagaagac catttcgacg 3960 ctggtcacgcggcgtgagtg cgccgagcgc tgcttcttcg aaacccagtt ccgatgcctc 4020 tccgcctcctttgcgccctc ctatcggaac aatcgtgagc ggtaattgac tatttgttgt 4080 ttgttgtttgctatttggtt gtttgttgtt gtcggttgtc agtgggtggt tgttgtagtt 4140 gctggtcgccggacaaatga atagcttttg ttgtgcattt ttaatgcatg gtcgagactt 4200 ttcgccggattatgacatca ctccgaggat ggtgatggga taggttagga ctattcaaca 4260 atgtgtagcaagctaataat atgataatat gatattataa tacgaaagaa agatatatcc 4320 agaagacatcatcttttcga agctatgttc ttttccaaac aaatttttac aaaataagat 4380 aagtatttttgaaaagtgag atcatcagca 
atcatctaga ttttcttaaa ctcaagtata 4440 tatcgaattcttctgaaata accgaactga cttggtcata atcgacacat catcgtttag 4500 aagttaataaagcaaccttt aaccctcctc tttcgtagct tccgcggcga ggcgggtcct 4560 ggccagcgtccgtctccccg cctcggcaga tgtatgctga gcgacaggga caagaccgtc 4620 cagccggacgcctttcgcgc ggctccatac gacgaggagt acatggagaa ccagtgccac 4680 gaacgggccatcgaaagtga caactgttcc tacgagctgt acgccaacag cagtttcatc 4740 tatgcggaggccaggtattt gggcctctcc caaaaagagg tgtgtccgcc gcgcttcgga 4800 tgtcgcgcattatgattgta atcgaaatgg atggggggtc ggatgattga ttgatggctt 4860 ctacctccgtattgcagtgt caggcgatgt gctcccacga ggcgaagttc tactgccagg 4920 gtgtctccttctactatgta aaccaactct cgctgtccga gtgtctcctc cactcggagg 4980 acattgtatccctgggtccg cgaagcctga agctccgtga aaactcggtg tacatgcgga 5040 gggtcaagtgcctggatggt aagatcttct ggggatgtgg tatgctcaat cttaatcgat 5100 tccttattccgcagtccggg ttttttgcac ccgcgatgag atgaccatta agtacaatcc 5160 caaggactggttcgtcggca agatctatgc cagcatgcac tccaaggact gcctggccag 5220 aggatcgggcaatgggagtg ttctgctgac gctccagatc ggcagcgagg taaaggagaa 5280 ccgctgtggcatcctgcgtg cctacgaaat gacacaggaa taccaaaggt aagatgaagt 5340 ccaatgtccagtccattttt ttaattatat catttgcatt atttagaacg ttcatatctg 5400 ctctggtggtcatccaaaac aatccaaatg tgcaaaccca gggcgaccgg ctcatcaagg 5460 ttggctgtatacagagcaat gccaccacat cgctgggcgt ttcggttcgg gacagcagtg 5520 tggatagctcagagcctgtg cccagcgcca ttgcactgga gtcctcattg gagtacacag 5580 aacagtgagtgtattcttaa tagaatccct caaaatgctt aattctatca caatcgatac 5640 ctgcagcatgttcccacacg agggtgtggt tcactacaac agcagcactg ggccccatcc 5700 gcatcccagcatctcgcttc agattttgga tctatcccac cagcacgaga ccaacgacgt 5760 gcagattggacagaacctgg aactacagat tgtggcggag tacagcccac agcagttggc 5820 agagcacatggagttgcagc tggcaccact acccgacttt cgtgctacct cgctggtggc 5880 caagacagcggacaatgaga actttgtgct gctgatcgac gagcgaggat gtcccacaga 5940 tgccagtgtgtttcccgctt tggaaagggt acacacagcc agcaggagca tgttgcgcgc 6000 tcgcttccatgccttcaagt tctcaggaac ggccaacgta agcttcgatg taaagattcg 6060 cttctgcgtggagcgctgct cgcccagcaa ttgtattagt tcatcctggc aacggagaag 6120 
gcgacaggctgaccaaccag atcgtagacc ggaagaccta cgagttcaga accccgtgta 6180 catctccacggtggtggatg tggctccgca accagacaac tttaccagat cgcaggagga 6240 attgcccctcaactacaata tccgggtgca cggtccggac cagagcaaca ccaatagtta 6300 tctgtacggcgagcggggag tgctgctcat tgctggcata gacgacccgc tgcacctgga 6360 taacgtttgcatcaaccaga gcctgctgat tgcactgttc atcttctggc tgatctgtca 6420 agttgccctgctcttcggct gtggaatggt gctgcagcgc taccgccggc tggccaagct 6480 cgaggatgagcgacgcaggc tgcacgagga gtacctggag gcgaggagag tccactgggc 6540 ggatcaaggcggatacacac tctaattgac ggctggaacg caatgcgtat aaaatgcatc 6600 ttaatttaataaacataaat ctaacataaa tctaacaaat gtttgcaacc gaggataagt 6660 tcaggagttcttcttgggat ggtagtgctc ccacttgcga tggtttagcg aattgaaatc 6720 cgggcagtggtgagcgattt tgcgcaaata gtcggacaac ttgagcagct cggtgtccgt 6780 gccacggttgagatgagcct gacggaatgg gcggatcttt aggccggact ttgggttcat 6840 aaggaagttgcgacggatgt catcaaacat gatagtgttg ctcgagttgt attgcttgta 6900 cagggcccagattacaccaa gcggctttac gtccaccaca ccgcgctccg gcacatgaac 6960 tgatatcatggcggtggagt ccagatagaa catcaccttg tagttatcgt tactggccac 7020 gcccagcaggcgcatctttt cctcgatcca gcgcatgctg gtggcggacc agatgacaat 7080 gtcgtagtcctcgtaggcgg aagtcagaaa ctcgtgcaga tacggacgca ttagctccgt 7140 gcctgtttcagcaggcgatc ggtgatcgaa tagggtatag tctatgtcca ggacaagcag 7200 cttcttgccctcacgcggcg gcgctaactc cttgatcttg tagtctcgca cacgacgctg 7260 caccttggccaaatagacgg cggagtgctc cacggactct tcgcgttcat cggcgtcatc 7320 gaagtcgtcgaccacttcgc caatattatc gggcaggctg cacgcatcct cgatatcggc 7380 ctctgtggagcccaccatca taagcttaaa gttgggcttc agctccaaag cgctgatctt 7440 cacattgtcggctgctgtct ttcctgcaag tcattggatc ttaaaactga aatatcccga 7500 agcctaggagtgtcacgcac ctttgtactt caggttgagc agcttttgac gttccggacg 7560 cacctgtgtcttgcggaata tctcgtgacg cagcacttcc acggtgtcct ggtcggtgag 7620 gtccaccgggtactccttac cactccattt tacaatcact accacttctt tgacctccat 7680 cttagctggtttctattccg ctattaattt atcacaccat atatggtaat gtatgtttgt 7740 tggatagaatccagcaagtg gtttgcaata gtgtacctta aagatattaa ctaatttatt 7800 agaagaccatataaacagtc gagttgtcag 
aagtcgatag atactatcga ttgcaacgcc 7860 cggcgttatcgattgcaatc ggcttgcaat aaaaataatg attttttgat tatatttttc 7920 agagattattaaaaaatatt ttaaattttt taaaattata tatttagcaa ttaaagaaag 7980 tcatgcaaagacatgaggaa tgtccccaag ttgccaatag gcgattgttt cgccagttca 8040 ttggccacactggtcaccag ctgaaaacac aaaaaccgat cgtacagcat aaatttagct 8100 cgaaaatggactaaacaaag acagcgatcc ggaatccgag cggaaacata gtctgcatga 8160 actatctaacgatcctgctg tgcaaccgaa aaccgacgat gctctcgcgc cggaacaagg 8220 agaagtcccagcacaaggag ggcgtggtgg ggaagtacat gaagaaggac accccaccgg 8280 atatttcggtgatcaatgtg tggagcgatc agcgggccaa gaagaaatcg ctgcagcgct 8340 gtgcgagcacctcgcccagc tgcgagttcc atccgcgcag ctcgagcacc agtcggaaca 8400 cctactcctgcacggactcg cagccggact actaccatgc tcgacgagca cagagccaga 8460 tgcccctgcagcagcactcc cactcgcatc ctcactctct gccccacccc tcccatccgc 8520 atgtgcgtagtcatcctccc ctgccgcccc accagttccg cgccagcagc aatcagttga 8580 gtcagaacagcagcaactac gttaatttcg agcagatcga gcggatgcgc cgtcagcagt 8640 cgtcgccactgctgcagacc acatcatcgc cggcgccggg agccggagga ttccagcgca 8700 gctactccaccacccagcgg cagcatcatc cccatctggg tggtgacagc tacgatgcag 8760 atcagggcctgctaagcgcc tcctatgcca acatgttgca actgccccag cggccacact 8820 cgcccgctcactacgccgtc ccgccgcagc agcagcagca tccacagatt catcaacagc 8880 acgcctcgacgccgtttggc tccacgctgc ggttcgatcg agctgccatg tccatcaggg 8940 agcgacagcccaggtatcag ccaactaggt aaactgcctc ttgaagtact atatttgaat 9000 agatagcgcgcgattgataa agtgggtaga gataatatga gcagctcttg attaaaggaa 9060 taatccgtaaaaactacata ttgtcaaaaa gtgcttaata ttattataac ttttaaacaa 9120 tgacaatgcacgaaatgttt tattttcgaa acatttattg ttcaaagatt ttttatttga 9180 taacagattgctttatttat ttacaataag aaaagttgat gtacaaaacc ggtttctact 9240 cgccttacaataattaaaac aataacacaa tatatgattt tctgtacgag gaatataatg 9300 gaatatatatgatatataca acatttttaa acacattttc tcttctgttt ccacagctct 9360 ccgatgcagcagcaacaaca acaacaacaa cagcagcagc agcagctgca gcacacacaa 9420 ctggcagctcacctgggcgg cagctactcc agcgattcgt acccgatcta cgagaatccg 9480 tcccgcgtcatctcgatgcg cgccacgcag tcgcagcgat cggagtcgcc catctacagc 9540 
aatacgacggcctcgtcggc cacgctggcc gtggttccgc agcatcatca tcagggtcac 9600 ctggcggtgccatctggaag cgggggagga tccctgagcg gcagcggtcg tggtggcagt 9660 tctggcagtgttcgcggcgc ctctacctca gtgcaatcac tgtacgtccc accgcgaact 9720 ccgcccagtgcggttgccgg agcgggaggc agtgccaatg ggtcgctgca gaaggtacca 9780 tcacagcaatcgctcacgga gcccgaggag ctgcctctgc cgcccggctg ggccactcag 9840 tacacgctacacggtcggaa atactatatt gatcacaatg cgcataccac gcactggaat 9900 catccgttggagcgcgaagg tctgccggtg ggctggcggc gggtggtgtc caagatgcat 9960 ggcacctactatgagaacca gtataccggg cagagccaac gtcagcatcc atgcttgacc 10020 tcctactatgtctacacgac gtctgcggag ccaccgaaag cgattcgacc agaggcgtcg 10080 ctctatgccccacccacgca cactcacaat gcactggtgc cggccaatcc ctatctgctc 10140 gaggagatccccaagtggtt ggccgtctac tcggaggcgg actcgtccaa ggaccacctg 10200 ctgcagttcaacatgtttag cctgccggag ctggagggct tcgacagcat gctggtgcgg 10260 ctcttcaagcaggaactggg caccatcgtg ggcttctacg agcgctaccg gtaagtgagc 10320 ggccacatgccgctgcattc tccgctctcc gaaaagccac tactctcttg ttacaccttt 10380 cagtcgcgctttgatactcg agaagaatcg acgcgccggc cagaaccaga accaaaacca 10440 gtgacccggtgaccaggtga cgactgactc agaccacata ctcgccagca gctatatgca 10500 catcatagtgctcctgtaat cgacctttaa cttatttaac catcgactca tcgcgaaatc 10560 agtgccttatacgaaaccag acgagatggt agccaagcag atccatgaca gttcgaatgc 10620 cttgatgaaacgtagaattg tgctacgttc tatataacct taatgtgatt tgagcttggc 10680 gtttgtttgtaatgtgagca aagaaaatta aactggttta ctgatcatct tacctgccga 10740 gcgcaattgtaatcgatgtg ccacctgaaa ccccacaggt atttaacctg ggagtccgat 10800 tcatcgacggatgttttgga aattcagcgc cgcgaagtgt aaataaaggg caacagttgg 10860 tggccaagtcttactcgact tggcttggca catatttccg agttccatgc caagttttcg 10920 attcgcttgcaaaaattatg cattgggcac aagtgaatcg tggccgattc tgtattggca 10980 aaaaaaaaaacagcgctcca atagaaagtg aatcttatgt ttgttttcgt ttggctatgc 11040 ttatttttagtcgaacctga taattcattc agtcgcctct tatcgaatgc ttataaaact 11100 ttatagtcactgtttctgca ggtccctcaa aaacagtttc tactgctgat aagaagtttt 11160 cgaagtctggggagtattcg gcattggaaa ggccaaaagt tgtgttttat tatattttga 11220 
acatattaaacaggatacat aaaacgagag ttttagattg taattacatt tgtcatatct 11280 tttgctaaattgataagtaa acagaaaata tgactcgatg gatattattg actaataata 11340 tatatttaggggtttggtat gattactttg tactgtgaga tacaagttcg tttgtcccac 11400 agatacttttcaattcatag cttatcctac agatacattt caattcatag cttatcccgt 11460 agatacatttccattcattg cttatcccac agatacattt tagcatattt tttttgaaat 11520 ttgaatttgaaaaaaaagtg tttttttttt ttttgttttg agaactactc gtcttgtcaa 11580 aatatttaactgttcccgac tgaagtgccc accttttcgg ccgccgggtt ctcaagtgca 11640 aaaataatgtataataaaaa gccaagatac gtcggcggtc cgctctcgcc ccacttgttg 11700 ttgctgctgccgctggtgcg tcgctgccgc tgccgcagtc gacgtcgact ccatcgctcc 11760 aatatttaaacggatccatt ggatcgcgca ctcagtcgca ctggagagtc gccatcgcag 11820 ccatcatcatagcattccat tccacttgta gccatcggca gtcgctcaat cgtcagttgg 11880 gacacattatttaacttcat tcttaacgtg agtgaattga tgtgttgggt ggcgatcatg 11940 catatagcataggcaaacaa ctgttctaat ccgcattatc ttaatcacaa taatccggcg 12000 gcttatacagatgttttgcg ttagcagttg gcggctaaaa gcctctgctt gcccacatgc 12060 cagtgaaagttctaatccgg ctcaaacaga cgcacaacaa gcgtatctcg tgcgtggaat 12120 catgaatgaataaatgggtg ttactgttaa ctaacaatgg acctttttac caatcaatcg 12180 tcttatctatcaccagaatt gaaacagaat tagtgaataa cttatggtgc atatcagttg 12240 aaacatgaagattcgtgtga acgatcgtga aagatatggt gttcgaactt taaattaccc 12300 ttgtagtttaccactctcat tagttttgat ttatgtagaa ccaaaatttg gatcgtgact 12360 tgcgattagtattgcaatcg cagtgcattg cccaatctat tgattatctg caacttgtgg 12420 cagactgccgcaataattcg acggacacta tcagctagct ccattgattg agataagccc 12480 gttctcacgcggtgttttac acttcttggc aatcgccaag tcacggccct cgccatataa 12540 aaaatatagtatgaacaatc gggaatcttt tggttttacg atcgaccgac aaagcccatg 12600 tatttcctgttacgtccatt tgggccatat aggcacataa aatgggtgct ccaacgcttg 12660 ccgtgggaaagtgtgctcca attgcaaagt tgtaacattg agcgacattt gatgaaggtt 12720 accgacttttatctcgacaa aaacacacac gaattccaga tgaagcgagc gtgcgtagtt 12780 tgcactgcaagttttttttt tggaacaaat agttttatgt ttatatcatt ttatatcata 12840 ttatattccttattgattga gtgtctgcac gggtcattaa attaagaagc aaaaaaaaaa 12900 
aaggtgtcaggaattgcatt ccatactcct acgagtagat atcaatttca cccgatcgtg 12960 gtcaattggtcaattgaagt aattcacaat tgaatcaata caataccata tagggcttca 13020 ttgaagaagatgccagcagg actggatgct catgcatgaa taagttgaac gttgaacgca 13080 agcagaatggatttcagcac acaccgcctg accactttgc tgctcctcct cctggccaca 13140 ggtgagatatcgcaatccag atattgcgat ctaataatga gggaatttct cctgcccaca 13200 gttgccctgggaaatgccca aagcagtcag ctcaccgtcg attcccatga catcaccgtt 13260 ctgctgaacagcaacgagac ttttctggtg ttcgccaagt gagttgccat tgccgggaaa 13320 tccaaatccaaaacatatgg catcgtaaat ctattgtgcc cattacagcg gattgctaga 13380 cagcgacgtggaagttgcgc tgggaacaga ttcggaggat catttgctcc tcgatcccgc 13440 aacgtttgtgtatccagcgg gcagtactcg aaatcagtcg gtggtgataa ctggcctcaa 13500 agccggcaacgtcaaagtgg tcgcagatag cgatgatgcg aacaaagaga tgtgagtaac 13560 ttcacgggaatcccaactgt tcccgtacct aattggaaaa ttcacttatt ttccagtgtg 13620 aaggatgtgttcgtacgcgt gactgtggcc aaatcgagag ctttgatcta cacctccatc 13680 atctttggctgggtttactt tgtggcctgg tcggtgtcct tctatccgca gatctggagc 13740 aactatcgccgcaagtccgt cgagggactg aactttgatt tcctggccct caatatcgtg 13800 ggcttcaccctgtacagcat gttcaactgc ggcctctatt tcatcgagga tctgcagaac 13860 gagtacgaggtgcgatatcc gctgggagtg aatcctgtga tgctcaacga cgtggtcttc 13920 tcactgcatgccatgttcgc cacctgcatt acgatccttc agtgcttttt ctatcaggta 13980 ataatatatatagcaaatac cattcaatag ccttatcgcc gaagtggcaa cagttgtcgc 14040 attgaacactaattgccatc aatcaaaatg ccaaatcatt tgaatcacag cggatagtta 14100 cgatatgaagagtagataag gttttgactt gtaaaacatc catactttgt taaatttgtc 14160 cagagagcacagcaaagggt gtcgttcatt gcctacggaa tattggccat cttcgccgtg 14220 gtggtcgtcgtgtctgccgg tttggccgga ggatccgtca tccattggct ggactttctg 14280 tactactgcagttacgtcaa gctaaccatt accatcatca agtacgtgcc gcaagctctg 14340 atgaactatcgccggaagag cacctccggc tggagcatcg gcaacattct gctggatttc 14400 acgggaggaacgctgagcat gctgcaaatg attctgaatg ctcataatta cggtaggata 14460 tagtctatcaatttgtgatt ttcgaatgaa atcgtgtctg gtttccagat gattgggtgt 14520 cgattttcggtgatcccacc aaattcggac tgggtctgtt ttccgtgctc ttcgatgtgt 14580 
tcttcatgctgcagcactat gtgttttaca ggtgattgaa acattgtgtg aatatgatac 14640 ttaatctacgattatgtcat ctccactgta cacttatcat tattgctgtg ctgttttcca 14700 tttctccccaggcattcgag ggaatcctcg agctctgacc tcaccaccgt gaccgatgtt 14760 caaaatcgaacaaatgagtc gccgccgccg agcgaagtga cgactgagaa atattagagc 14820 tgcattatcatatgtctgct gtagagaaag acttttgtgc cagtagcgct ttatgtacat 14880 ttttagaattgtaaatatat ccgtatgccg tagctgccta agctttgtat aattcgtgcg 14940 ttttaattgaaatttagttt gactaaaatt tggaatttca ccattaaata aaacttaatt 15000 ttttgtaggagccagaaatc atacggtaca ttgctcgacc attcaaaggg ctgtgcagtg 15060 aaaccaatttgctgcatacg gcgcgttatt tgcaaactaa taaatagatt gaagtattga 15120 aaaaatttcaaaacagaaat tctaacttgc cgcacaatgg gcagcactgt tcgcactcgg 15180 ccaaatccttatcgatagct tatcgatagc catggatata tgacattaag ttagccaatt 15240 tccggttagttgacatccct ggagcacgga agattcttgc ggacacaaat cgcaactgct 15300 aaataaaatttatttatttg agtgcacagc catgagtctt cacaagtccg cgtcgtttag 15360 cttgacttttaaccagtgag cggagatatt ttattcggtc ttacccaaca aaataatgtt 15420 gcgcctttttgcagaaacac ttcgattgtt tcgcgtagca atagtcgcac aatttttgaa 15480 gctttcaaggagttcctgga tttttgggat atcggcaacg aagtttctgc agagtcagca 15540 gttcgggtctccagcaacgg agctttcaac ttgccgcaga gttttggcaa cgaatccaac 15600 gaatatgcccacctggctac gcctgtggat ccagcctacg gaggcaacaa cacgaacaac 15660 atgatgcagttcacgaacaa tctggaaatt ttggccaaca ataattccga tggcaataac 15720 aaaattaatgcatgcaacaa attcgtctgc cacaaggggt gagcaaattc aaaacacgcg 15780 ctccaatcgataaacattgg ctacggcgat tgttcgcgct gcgtggcgaa tggcaaaatc 15840 caaatagtcggtggccacta cgattctgta gttttttgtt agcgaatttt taatatttag 15900 cctccttccccaacaagatc gcttgatcag atatagccga ctaagatgta tatatcacag 15960 ccaatgtcgtggcacaaaga aaggtacagt gcggcaacaa attgatgatc gaacagtaga 16020 aaccttgcatgtagcaacac gcttgtactt gcatcattcg cgcggccaac ttgtttgtgt 16080 ttgtttatccagccaaggcg cagtttgcca ctaagttttt atttcccttt tacactttag 16140 cactgattccgaggatgact ccacggaggt cgatatcaag gaggatattc cgaaaacggt 16200 ggaggtatcgggatcggaat tgtgagtacc tggtcacgtg gtcacatgtg gtttgcctgg 16260 
ttgctaactattattgtttt tattattcca ggaccacgga acccatggcc ttcttgcagg 16320 gattaaacgtgagttgtgct tttaatgtgc aaagctatag cttactaact atttaatatt 16380 attccccgcagtccgggaat ctgatgcagt tcagccaggt gggtaacatc gattagctat 16440 tgcatcttgaagcgctggga cagatcggcc tgcacgagga tcagcaggaa gctggccacc 16500 gccgagaagacattgctgat cagtcgcatg tccagctcgt acaagcccaa gggtttaatt 16560 tggtacttggtcaccgtgac cagcagagta aagccgtgga ctgcctgacg gtagcggctg 16620 tccgcatgctggagattcat ctcctggaga atgactgccg atcttcgggt ggccaccaat 16680 aggtggttgcacaaatgcgt gagcaatgtg atctccgcca gcgagatgga gaggaaaacc 16740 agattgatcagcgatccaag accatcgtac ggcttgccca tgattaaggt gtccgctatg 16800 gcatagtacagactgtagaa acccaccgtt attccgagca ggtggcatat gagcgacaga 16860 atcatggacaaggacattgg ggtcagatac tttcccgaat gcacatatat caacctatag 16920 cgatacgccagctggtcgag ttcatccgcc aaggcgcaaa atcgctgcat gcggtagtat 16980 ttagtgtacaactttagctg gtccttcctc tgcagcagat tcacctcctg cagctgcgct 17040 tccagccgtctgttcagagc gtacagaatc tccttcacca ccaccattgc gccaaagtag 17100 cagttattgagaaaattcga aataattaag ggaaacagcc ggtacaaggt ccagatcaag 17160 ctcatctcgggatgctgccg cctctgttgc agtatgaaag ccacttcaat tgttagagga 17220 aaagccacggtcttgaccag agccaaaacg atggatatgt acagcgacct gctgtccaga 17280 cggaattcttttagggtatc aaagaagggc actttgctca acaccttggc cacatggtca 17340 ctgattatcatttgcgacac atagttaata acagccaccg taatgttcat atagctgtac 17400 agagtggtggcgtccttcag gttgatctga ccctcctggt actccttgta gatttgccgc 17460 ccgtaaaccaagctgaatgc aattgcccac agcgaagcaa aggccagatt tgcctttgag 17520 aagcggaatctttcacgacg gcccgcccga tatcgattgg ccaggagtcc gaagacggtc 17580 ataaagcctatcagtatgat cgtcagaaat ttcaccatac gccgatgcgc gtagtcgctg 17640 gtgaagtccatttctctcga acaattaata caaactgtga gcgcactttc cacagcatta 17700 atatctgcttaattgttttc caactaccca actgatgcca tctagaggac ctgtcaagta 17760 gccggacactatcgggacac atcgcgaaac gcatgtattt caccggccgt ccagaaacca 17820 actgagcatgcgttgtgcta ctactagcca caaacaaaag agcataagaa gcgtgaggga 17880 agcggcattccttgcgtgac tcagccgctg cctgcaattt cataagagcg acatgacgtc 17940 
aaagtcgcttcgaagttcac tttcagttgg aggacagaac aaaacactct tatctagccg 18000 attagcacggtgcactcctt cccgtcgtca tcgtttagcg agaatttcaa gcacttgtga 18060 aaaatagaatagaatacaaa acaaatcgcc agtccatttg taactcgagc aagctggaac 18120 atgaagctctatcagctcta tgagcgcaaa gtgtgaaccc ttatatgatt gcgagttaag 18180 ttgacattcaaataatatct tgtttttgct tacagcaatc cgtgctgcgc gaaatgatgc 18240 tgcaggacattcagatccag gcgaacacgc tgcccaagct agagaatcac aacatcggtg 18300 gttattgcttcagcatggtt ctggatgagc cgcccaagtc tctttggatg tactcgattc 18360 cgctgaacaagctctacatc cggatgaaca aggccttcaa cgtggacgtt cagttcaagt 18420 ctaaaatgcccatccaacca cttaatttgc gtgtgttcct ttgcttctcc aatgatgtga 18480 gtgctcccgtggtccgctgt caaaatcacc ttagcgttga gccttgtaag tgaagataac 18540 aatacagatcgaacaggatt atttaactat catttgtaca aacctttagt gacggccaat 18600 aacgcaaaaatgcgcgagag cttgctgcgc agcgagaatc ccaacagtgt atattgtgga 18660 aatgctcagggcaagggaat ttccgagcgt ttttccgttg tagtccccct gaacatgagc 18720 cggtctgtaacccgcagtgg gctcacgcgc cagaccctgg ccttcaagtt cgtctgccaa 18780 aactcgtgtatcgggcgaaa agaaacttcc ttagtcttct gcctggagaa agcatggtaa 18840 ggtgacagcaaaactctaga tggctagaac aaagcttaac gtgttttctt tcttgcagcg 18900 gcgatatcgtgggacagcat gttatacatg ttaaaatatg tacgtgcccc aagcgggatc 18960 gcatccaagacgaacgccag ctcaatagca agaagcgcaa gtccgtgccg gaagccgccg 19020 aagaagatgagccgtccaag gtgcgtcggt gcattgctat aaagacggag gacacggaga 19080 gcaatgatagccgagactgc gacgactccg ccgcagagtg gaacgtgtcg cggacaccgg 19140 atggcgattaccgtctggct attacgtgcc ccaataagga atggctgctg cagagcatcg 19200 agggcatgattaaggaggcg gcggctgaag tcctgcgcaa tcccaaccaa gagaatctac 19260 gtcgccatgccaacaaattg ctgagcctta agagtaagca gtgaatcgga ggacaaagag 19320 attaagctttacttaccgaa ctttcctttc agaacgtgcc tacgagctgc catgacttct 19380 gatctggtcgacaatctccc aggtatcaga tacctttgaa atgtgttgca tctgtggggt 19440 atactacatagctattagta tcttaagttt gtattagtcc ttgttcgtaa ggcgtttaac 19500 ggtgatattccccttttggc atgttcgatg gccgaaaaga aaacattttt atatttttga 19560 tagtatactgttgttaactg cagttctatg tgactacgta acttttgtct accacaacaa 19620 
acatactctgtacaaaaaag ccaaaagtga atttattaaa gagttgtcat attttgcaaa 19680 catatcctcgtggtgtacgc caatgcccag agcctactgt acccccaccg tggagcacat 19740 gctatgtgacatgtgtggct tgtgtgcggt caatgcactc aggatgcaac tcagctagct 19800 agctgctaatatgtcaaaat tgctgcgtcg catttacata ctttatttat acccgtatct 19860 gcacgtctttggttttagtt ctatgctttc aaaaaaaaaa aaacaacctc aagcagggcg 19920 catgcgttgcgccagcgttg cacatgtgcg aggatgcaaa aaagtgcaac aaacaccaga 19980 tgttgacactgtgccgctgc agctgcaggc gactttagct tttgccacat gcggcagcta 20040 aatgtttactctagcccacc gatcgctgtt cattgaccta gggcaggggc attaagtgcg 20100 ccctaatcgtaacggaatga tagcctctgt gtccaaaaat tcagccaaag cggatgcact 20160 cacttccatttggggcctgt ccttcttcga ccggctgcca cttccactac cagtttggca 20220 ccacgaaaatgggtcgttca aagtgctcaa aacccagcgg agcaactcac tcaattctcg 20280 ttggacgagcgcacagaaaa gtggttttgg atacgagttg agttcgagag acctttctgc 20340 actgggaacatacatgcggc tttgtgtaac agaataataa agtacgcaaa catatctgta 20400 atacttaaagcacaaagaac aaatataaat gtatcataat ttgtttaatt atttattcga 20460 ggtttccaaacaagtcattc tgataacaaa agttgtaaaa ataaaatcca ctaaaattaa 20520 atatcacccacttctcagaa taagcacagc tgtatatact tcagtatata tttttttcag 20580 tgcacttttcccaagcgatg caatcgcctt agaagcccaa ttaaatacgt ttctttgatt 20640 ggcgggtgccaaaaggttga caattcgaaa gtggcgcaca ctgggaggca gtgactcata 20700 atttacataattatttcggg aagatattaa gactcatact atattcaagc agttgtttat 20760 cattttaaactggcagatac cccatcttta cggaccagat aaagggaaag caaacacggc 20820 tgggctcttatcggctacga tcttcatccg cagttcccac tgtgcgcgtg gggaaaacaa 20880 tatggcccaaacacataaaa aacaacaaaa aaaggaaaca accacagaaa gccgggctaa 20940 gacgtcaggtgaaacgcagt agcttcactc gcgactcggc gcttccactc aaaggtgcta 21000 ccgctgcccactcaaatctg cagctcgtag atacgaaaac cagatagcgt cgagcggctg 21060 gcgatcttcactcaatgggg ggaaatactg ctatagagtc gaaagcttgt acacgtagtt 21120 tggcattcgcagtcgcttgt tggcgttttt agtctgctgc ctgatcttcg acgcgctgca 21180 gctgttttggagtcgccgcg agtgccatat ttgctttgac cgcgaaaatt tctgggctaa 21240 aaacagagatatttgagata cagatacata tatctcatat cacatattag ccaattgtgg 21300 
gtgcaacaagctgtgagtga tggtggagac ggcaacgaca acgaccataa cccgcaccac 21360 caccgccgttccggctggtg cagtaacggt aacaggaccc actgcctcgg ccacgcccac 21420 cgcgacacaggcggccgcgc aggcgcatcg caacgatgag accacccggg ccatcttcaa 21480 tctgaaagtcatcgtctttc tgctcctcct gcctctggtc ctgctggccg tctttctcaa 21540 gcacctgttggattacctat tcgcgctggg actcaaggag aaggatgtca gtggcaaggt 21600 ggcactggtgagttgcattc gagtgcccat tggggctaac aaatggctgc aatgagcgtc 21660 tggcaaatgagccattaata aggctagtca gatgcacatc agacatggat gcacttagaa 21720 aatgcagtcgcatttcatgt taagtactga cattaaaaaa gagatatatg tctgtgttta 21780 gatacatctttgggtaccaa attaggttca gatacttcgt aaagaaattg gtaatggtat 21840 actttaatcgttggcttcat gtgaatttgt tttcccagta tccgcttcta agtgatcttg 21900 tatctgacgactacttagcc aaccagaaac gtcacgcact ttccttttcc agtggctgcc 21960 tccgggtttccaccacgccc acctttggct cacccacctt ttcccctttc ccgcttttct 22020 ttgctttttatttctcctct tttttttttt tttgatgtca ctgccattag ggtgcggtcg 22080 atcgcttagtactgtgttat taatgtaaat atttatgcgt ttggtgccca gcttggttag 22140 ttgttggccaattgtttagt tgtgtccaca gagccgcgtc tttggtgcca cggacagtta 22200 atgtgacataatttcgctgt aagcgctgca atcaaagtga atctccagct gaaatcgtgc 22260 tcatggcaaccatatcgcgc tccaataatc acatatgcat cttggggcgt cgaattatgg 22320 agaagtcaattgccaatggg cgccaatgcc actggacaag gtcaagtgat gatgccgctg 22380 ccgatgctccatatcgtaaa gaacctgatc gaattcggaa cccattagca tgcttttcag 22440 gctttttatagtgggcgtgt gccggccata agcgtctcac gtagcgtatt aatgattcac 22500 agcggcccgacttttgtttt agtctcagct ttttttttcg atcgttccct cagatatcgt 22560 tttctcagatacagatacac atacagatac atttttgttg cggttgcaca gtggtatttt 22620 cgggtggcagggactggaga attcccatgc caactgttag cagcaactta attataagat 22680 tgactttcgttgataagttc tattgacatc atggttgcgg aattcgagtt atttcagctc 22740 aaaaataccccctttttcga caccactggc caacggccaa ctgcaaactg gttttgcgtg 22800 tgtcgctatatttatttcca agatgaacga aaagagcgca aaaatgcaaa cctcagaaag 22860 ttcacttttgttttcagtct aatgtttgtg tttacaaaca atagagtgta gaatttcgat 22920 gggccaaagtatctgcaagt gtgtagcatg ccgggtatct ctcagatgcg tagataaaac 22980 
tcaactactgttgccgctgt taatttgcat atgatattga aattcttcgg ctgttctata 23040 atcacaacaactgcgcattt gttattgttt tccccattgc tagtcgctaa cgtgccaaac 23100 tctgaattgaactcattccg gcttacattt cgattcaccc aactaccgca cacccaaaac 23160 ggcggctgaggtcacccagt gggcttcaat tacggtcaaa agtcactcaa ttgtgcccca 23220 gagggtcggcccaccgagcg tatgagtaat gccattcata agtcgcctct gccgctgttg 23280 ctgctgctcacataattgtc cgtaaatgag gtttttgttc aatgcgaagt cacattagct 23340 cgagttgattgtttgcaaat taagctaatt aatttacttg agtatacgag tgtaatgtga 23400 gtaacctgtgatttaaaccc aggtgaccgg cggaggcagt gggctgggtc gcgagatctg 23460 cttggaactggcgcggcggg gctgcaagct ggccgtcgtt gatgtcaact ccaagggatg 23520 ttacgaaacggtggagctgc tctccaagat tccacgctgc gttgccaagg cctacaaggt 23580 gagttcactagctgcttgga tatttaatgg tttgataaca agaatcttta ttccagaacg 23640 acgtgtcatcgcctcgcgag cttcaactga tggccgccaa ggtggagaag gaactgggtc 23700 ccgtggacattctggtcaac aatgcctccc tcatgcccat gacttcaaca cccagtctga 23760 agagcgatgaaatcgacaca atactgcagc tcaatctggg ctcctacata atggtgagtg 23820 tgtgcttctgaaaatgggac aaatataaaa cttcttgatt ttgcagacca ccaaggagtt 23880 cctgccgaagatgataaacc gcaagtccgg tcatctggtg gcagtaaatg ccttagcggg 23940 taagcttacttggttaaagt gcttaccact tcattgatac ctatgtatat ataactcgca 24000 tttaggtctagttccactgc caggagcggg catctacacg gccaccaaat acggaatcga 24060 gggcttcatggaatcgctgc gagctgagct gcgattgtcc gactgtgact acgttcgcac 24120 cacggtggccaatgcctatc tgatgaggac cagcggagat cttccactgc tcagtgatgc 24180 ggggtaagattggtttatag tttgggcaga tcacttggtc tcatgcggct actacattta 24240 gcattgccaagagctatccc ggactgccca caccatatgt ggccgagaag attgtcaagg 24300 gcgtgttgctgaacgagcgc atggtgtatg tgccaaaaat attcgcactc agtgtatggc 24360 tgctcaggtgagaattgaat tagcccaggt aaccagcgat tatttctaac gattattgtt 24420 gtcgccttgctttagactgt tgcccaccaa gtggcaggat tacatgctgc ttcgcttcta 24480 ccacttcgatgtgcgcagct cccacctgtt ttactggaag tagggcacag gagaaggcac 24540 atccccacccagaagcattt actcctgttt gtttcccaat tgcagttctt tattcaactg 24600 ttgcttacgctaggtgtaca tgtttagcta tttatacgaa tctttaactt aaattaaatc 24660 
tatatcctaacattagaatt acgtccggtt ggcctttcct attttatttc gtataagccg 24720 aagttgttcggagtagcaca tcctctcgga ctgctggacg caggacctcc gttcgtagtg 24780 ccaagtgtagttcaagtggc atcgatggac cagcttggag ccactggagc agtagtagaa 24840 gtaggcgcagttccgtggat gtggcataaa gccatagact ccctcctggc agttgatgat 24900 attctctcgcgtttgcatgc gattgcagga cactagatga gcaggagtac aggccttggc 24960 cagtccagccccctcgtagc agaccatata aggataacat ggtccggcat tgggtaaaag 25020 tcgcagggtaatcgccaatg gttccgcttt ctgagctggc ttcttgacca tcgaggggga 25080 tttagtggttatgcctacgg gatcccggca tctcgacacc aactttcgat ccaaacagcg 25140 ttccaatttttcgtcgtagt aatgaccatc caagcactcg gcctcaaagg atcctggacc 25200 ggcacaatatatgtatttgg agcaattgct agagctggcg acataaactc ccaattgtgg 25260 agcactggcacactcttcga actccagggc actggatcga tgacccagca aggtcaccaa 25320 aataattgttaagaaggtta cagctcccat ttcatttatt tttttaacga ccgaaatagc 25380 gggatgacttctgtagactg acttcatcga tgatgggttg agtatatttt tgcatgtgct 25440 ccaactgataaagaagacaa gttattccat cgattactac gctggttatc gtctggtaga 25500 taccgctaatgagcacatgg cagtaactgc cacgcccact ctgggcggtc tcggtaattt 25560 gcattttcgtagcatacttc gcagcagcag caaagcaacc gagtatttaa tgataccaca 25620 ccgcagcataatgctcgact gggcgccggt tcaataaaaa ttgaaaatgc actcaattcg 25680 caattaagtgtcgccacttc cgtacggaca agcggacaaa cggacggaca agcggacaaa 25740 tggacggataaacggacgga tggatggtcg tcgaacgata ccattcaggc cattcaatcc 25800 attcatcgcagtcatcctca ttattatttc catcgtcatc gtggtcgttg ctggtcggag 25860 ttaagcgatggccatcgatt taatatccga tgagatattc ataacttgca attaggtttg 25920 gtggctctgcgctttacgta aatgattgcg tagccgatta atgaagaatt accagtgcaa 25980 atggctgggatctgtgggca ttatccaatt gaccaactac catgctaccc cactaccatt 26040 accattaccataatgtgcaa tgtgccaatt gggctcaaat taaaagtttt attaattgtc 26100 aattaaacgctgtcgcccag cagctgcttt gtggcataat ttttgggtca atctgcatat 26160 ctgattaacaggttataccg ctcagtctac tacatatacc atgcaccaga tgccgcgggg 26220 cacagacaacaagaagtaaa agaaaggacc ccatatggtg ccgacggctc aagtgattaa 26280 gtgcacgacgagatcttcaa atgcagtgca acatgtgcac aaatacaaaa cacacacaca 26340 
cacacacacacacgcatatt gaaaatgtat gtaaattcta attaagattg tggatgaaga 26400 cccccagcaccttgatactt ctgctcaatg cgcattgcgc atgcgcagcc ccgcatccga 26460 agatccataaaaatagctca ctaattattt gtgtgctagg gttacagttc tcataaaaaa 26520 caaacaaactgtcgggcgtt ttatggatct tctgcctcta tggcctcaat gcccccgcga 26580 agttttcgatccccattcga ttcgaaaccg aagaagagct acgaccaatc acttttcaat 26640 tcctatgagcagttgagcat caattgattt cgatatgaaa ataaaataca tttatttatt 26700 atcacattacgtatcacagc cattcgcccg cctacgccct ggcatctgga tcgccacatc 26760 catcgtgcggaccttgtgcc ggcatttccg agctgattag cctccgaatc tcgaccagaa 26820 cccggtccgttcgagcctcc aggttgtcga gggcggtgtt taggtcatcc aagctggaat 26880 tgactctggccatcagacgc tccgagttgt tggtcagctc gatgaggtca tcgaaactgc 26940 tggcctggcgactctccatc gatatcctgt ccagatccag ctgcagctgc tcatcggcgc 27000 tgtccatctgggctttaagg gctggaaaac aactttcgat ttaaatttaa atttttttca 27060 ccctaaatcatgattttcgg tgttattttg tgccatgcga tccgaagtgt aaagcaaatt 27120 tgacttggtttgttttgcta tcgaacataa ttaaagttgc ttaccataaa ccaatttaat 27180 ttaattgtaattgcagctaa ctggcttttg ggtacttttg cttttaacgc caaatgtgaa 27240 atattaagtatattttattt aagcgatggc acctgtaaat tgagatttaa gggggtatat 27300 taaatgggtgaacttgatga tttttttttt tcatcaaacg tttattaaag tctattgctt 27360 aaaaaaatgaaagtaaattg cttgccattt taggaggata tttttgaaaa atcgttacaa 27420 ctttt 2742519 1781 DNA Drosophila melanogaster 19 gaattcggca cgagacgcca tacaaaaagttggaactgag tggaatcgga gtactatata 60 gccagccgat cccttccaga gcgccggaagagtagctcac atccgaaccc acgtccccga 120 gccgatgtcg cggcgggaat agagcgattcgcagtccaaa cacgatgata aaccccattg 180 catccgagtc ggaggccatc aattcggccacctatgtgga caactatatc gattcggtgg 240 aaaatctgcc ggacgacgtg cagcgccagttgtcacgcat ccgcgacata gacgtccagt 300 acagaggcct cattcgcgac gtagaccactactacgacct gtatctgtcc ctgcagaact 360 ccgcggatgc cgggcgacgg tctcgaagcatctccaggat gcaccagagt ctcattcagg 420 cgcaggaact gggcgacgaa aaaatgcagatcgtcaatca tatgcaggag ataatcgacg 480 gcaagctgcg ccagctggac accgaccagcagaacctgga cctgaaggag gaccgcgatc 540 ggtatgcgct cctggacgat ggcacgccttcgaagctgca 
acgcctgcag agcccgatga 600 gggagcaggg caaccaagcg ggcactggcaacggtggcct aaatggaaac ggcctgcttt 660 cggccaaaga tctgtacgcc ttgggcggctatgcaggtgg tgttgtgcct ggttctaatg 720 ccatgacctc cggcaacggt ggcggctcaacgcccaactc ggagcgctcg agccatgtca 780 gtaatggtgg caacagcggc tccaatggcaatgccagcgg cggaggaggc ggagaactgc 840 agcgcacagg tagcaagcgg tcgaggaggcgaaacgagag tgttgttaac aacggaagct 900 ctctggagat gggcggcaac gagtccaactcggcaaatga agccagtggc agtggtggtg 960 gcagtggcga gcgcaaatcc tcgttgggcggtgccagtgg agcgggacag ggacgaaagg 1020 ccagtctgca gtcggcttct ggcagtttggctagcggctc tgcagccacg agcagtggag 1080 cagccggagg tggtggtgcc aacggagccggcgtagttgg tggcaataat tccggcaaga 1140 agaaaaagcg caaggtacgc ggttctggggcttcaaatgc caatgccagt acgcgagagg 1200 agacgccgcc gccggagacc attgatccggacgagccgac ctactgtgtc tgcaatcaga 1260 tctcctttgg cgagatgatc ctgtgcgacaatgacctgtg ccccatcgag tggttccatt 1320 tttcgtgcgt ctccctggta ctaaaaccaaaaggcaagtg gttctgcccc aactgccgcg 1380 gagaacggcc aaatgtaatg aaacccaaggcgcagttcct caaagaactg gagcgctaca 1440 acaaggaaaa ggaggagaag acctagtctattaggccagc ctatccaacc cattgctctg 1500 tgtctaacac caggctctgt aaaatattcgatcctaagat ttaccttaat gtatatttag 1560 tgactttctt agacccgatc ccttttcgactttcccctct ttcacccagt ttagatccct 1620 cgcttctatg gttataggtc gtcagttttcatttaaagtt tctgtacaaa caatatcttt 1680 ctcaatgtaa acacacaaaa actcgtataattagagtaca cctaaactta atttatggta 1740 ataaacgttg atattcaaaa aaaaaaaaaaaaaaaactcg a 1781 20 433 PRT Drosophila melanogaster 20 Met Ile Asn ProIle Ala Ser Glu Ser Glu Ala Ile Asn Ser Ala Thr 1 5 10 15 Tyr Val AspAsn Tyr Ile Asp Ser Val Glu Asn Leu Pro Asp Asp Val 20 25 30 Gln Arg GlnLeu Ser Arg Ile Arg Asp Ile Asp Val Gln Tyr Arg Gly 35 40 45 Leu Ile ArgAsp Val Asp His Tyr Tyr Asp Leu Tyr Leu Ser Leu Gln 50 55 60 Asn Ser AlaAsp Ala Gly Arg Arg Ser Arg Ser Ile Ser Arg Met His 65 70 75 80 Gln SerLeu Ile Gln Ala Gln Glu Leu Gly Asp Glu Lys Met Gln Ile 85 90 95 Val AsnHis Met Gln Glu Ile Ile Asp Gly Lys Leu Arg Gln Leu Asp 100 105 110 ThrAsp Gln Gln Asn Leu Asp Leu Lys Glu Asp Arg Asp 
Arg Tyr Ala 115 120 125Leu Leu Asp Asp Gly Thr Pro Ser Lys Leu Gln Arg Leu Gln Ser Pro 130 135140 Met Arg Glu Gln Gly Asn Gln Ala Gly Thr Gly Asn Gly Gly Leu Asn 145150 155 160 Gly Asn Gly Leu Leu Ser Ala Lys Asp Leu Tyr Ala Leu Gly GlyTyr 165 170 175 Ala Gly Gly Val Val Pro Gly Ser Asn Ala Met Thr Ser GlyAsn Gly 180 185 190 Gly Gly Ser Thr Pro Asn Ser Glu Arg Ser Ser His ValSer Asn Gly 195 200 205 Gly Asn Ser Gly Ser Asn Gly Asn Ala Ser Gly GlyGly Gly Gly Glu 210 215 220 Leu Gln Arg Thr Gly Ser Lys Arg Ser Arg ArgArg Asn Glu Ser Val 225 230 235 240 Val Asn Asn Gly Ser Ser Leu Glu MetGly Gly Asn Glu Ser Asn Ser 245 250 255 Ala Asn Glu Ala Ser Gly Ser GlyGly Gly Ser Gly Glu Arg Lys Ser 260 265 270 Ser Leu Gly Gly Ala Ser GlyAla Gly Gln Gly Arg Lys Ala Ser Leu 275 280 285 Gln Ser Ala Ser Gly SerLeu Ala Ser Gly Ser Ala Ala Thr Ser Ser 290 295 300 Gly Ala Ala Gly GlyGly Gly Ala Asn Gly Ala Gly Val Val Gly Gly 305 310 315 320 Asn Asn SerGly Lys Lys Lys Lys Arg Lys Val Arg Gly Ser Gly Ala 325 330 335 Ser AsnAla Asn Ala Ser Thr Arg Glu Glu Thr Pro Pro Pro Glu Thr 340 345 350 IleAsp Pro Asp Glu Pro Thr Tyr Cys Val Cys Asn Gln Ile Ser Phe 355 360 365Gly Glu Met Ile Leu Cys Asp Asn Asp Leu Cys Pro Ile Glu Trp Phe 370 375380 His Phe Ser Cys Val Ser Leu Val Leu Lys Pro Lys Gly Lys Trp Phe 385390 395 400 Cys Pro Asn Cys Arg Gly Glu Arg Pro Asn Val Met Lys Pro LysAla 405 410 415 Gln Phe Leu Lys Glu Leu Glu Arg Tyr Asn Lys Glu Lys GluGlu Lys 420 425 430 Thr 21 2666 DNA Drosophila melanogaster 21cattttgtac agtctaaacg gggattcgcg taaactacgc agaaatataa acaaacaaaa 60actagtagac tatagaatat aaacagtttc ctaccaatgg agacttgtga agtggaggga 120gaggcggaga cgctggtgag acgcttctcc gtcagctgcg agcaattgga gctggaagcg 180agaattcagc aaagcgctct gtccacctac catcgcttgg atgcggtcaa cgggctgtcc 240accagcgagg cagatgccca ggagtggctg tgttgcgccg tctacagcga actgcagcgc 300tcgaagatgc gcgatattag ggagtccatc aacgaggcaa acgattcggt ggccaagaac 360tgctgctgga acgtgtcact aacccgtctg ctgcgcagct ttaagatgaa cgtgtcccag 420tttctacgcc 
gcatggagca ctggaattgg ctgacccaaa acgagaacac tttccagctg 480gaggttgagg aactgcgttg tcgacttggt attacttcga cgctgctgcg gcattataag 540cacatctttc ggagcctgtt cgttcacccg gcaagggtgc ggacccgggt gccgcgaatc 600actaccaagc gctgtatgag ttcggttggt tgctcttcct ggtcattcgc aacgagttac 660ccggttttgc gattacaaac ctgatcaacg gctgtcaggt gctcgtttgc acaatggatc 720tccttttcgt gaacgcctta gaggtgcccc gatccgtagt tatccgccgg gagttctctg 780gagtgcccaa gaattgggac accgaagact tcaatcctat tttgctaaat aaatatagcg 840tgctagaagc actgggagaa ctgattcccg agctaccagc gaagggagtg gtgcaaatga 900agaacgcctt tttccacaaa gccttaataa tgctctatat ggaccatagt ctagttggag 960acgacaccca tatgcgggag atcattaagg agggtatgct agatatcaat ctggaaaact 1020taaatcgcaa atacaccaat caagtagccg acattagtga gatggacgag cgtgtgctgc 1080tcagcgtcca gggggcgata gagaccaaag gggactctcc taaaagccca cagctcgcct 1140tccaaacaag ctcgtcacct tcgcatagga agctgtccac ccatgatcta ccagcaagtc 1200ttcccctaag cattataaaa gcattcccca agaaggaaga cgcagataaa attgtaaatt 1260atttagatca aactctggaa gaaatgaatc ggacctttac catggccgtg aaagattttt 1320tggatgctaa gttgtctgga aaacgattcc gccaggccag aggcctttac tacaaatatt 1380tgcagaaaat tttgggaccg gagctggttc aaaaaccaca gctgaagatt ggtcagttaa 1440tgaagcagcg caagcttacc gccgccctgt tagcttgctg cctggaactg gcacttcacg 1500tccaccacaa actagtggaa ggcctaaggt ttccctttgt cctgcactgc ttttcactgg 1560acgcctacga ctttcaaaag attctagagt tggtggtgcg ctacgatcat ggttttctgg 1620gcagagagct gatcaagcac ctggatgtgg tggaggaaat gtgcctggag tcgttgattt 1680tccgcaagag ctcacagctg tggtgggagc taaatcaaag acttccccgc tacaaggaag 1740tcgatgcaga aacagaagac aaggagaact tttcaacagg ctcaagcatc tgccttcgaa 1800agttctacgg actggccaac cggcggctgc tccttctgtg taagagtctt tgcctcgtgg 1860attcctttcc ccaaatatgg cacctggccg agcactcttt caccttagag agtagccgtc 1920tgctccgcaa tcgacacctg gaccaactgc tgttgtgcgc catacatctt catgttcggc 1980tcgagaagct tcacctcact ttcagcatga ttatccagca ctatcgccga cagccgcact 2040ttcggagaag cgcttaccga gaggttagct tgggcaatgg tcagaccgct gatattatca 2100ctttctacaa cagtgtgtat gtccaaagta tgggcaacta tggccgccac 
ctggagtgtg 2160cgcaaacacg caagtcactg gaagaatcac agagtagcgt tggtattctg acggaaaaca 2220acttccaacg aattgagcat gagagccaac atcagcatat cttcaccgcc ccctcccagg 2280gtatgccaaa gtggctcctg ctccagtcat ccaccttcat ctcccgccgc atcaccactt 2340tccttgcaaa gctcgcccaa cgtaaagcgt gctgcttcga gtaacgactt gatgagagag 2400atcaagcgac caaacatcct gcggcgtcgc cagctttcag tgatctaata accaatcaaa 2460aaaggcttaa atacttggct gcattttacg cagctagctt agtatatttc ttaaactcaa 2520aaatggtaat taaataatgt ttaaattata gatattttat taacttgttc aagtaagtta 2580aaagcttttg cttttgtaaa aataaaggaa taactgccac tcgtagttta aataaatttt 2640taaaaaaaaa aaaaaaaaaa ctcgag 2666 22 556 PRT Drosophila melanogaster 22Met Asp Leu Leu Phe Val Asn Ala Leu Glu Val Pro Arg Ser Val Val 1 5 1015 Ile Arg Arg Glu Phe Ser Gly Val Pro Lys Asn Trp Asp Thr Glu Asp 20 2530 Phe Asn Pro Ile Leu Leu Asn Lys Tyr Ser Val Leu Glu Ala Leu Gly 35 4045 Glu Leu Ile Pro Glu Leu Pro Ala Lys Gly Val Val Gln Met Lys Asn 50 5560 Ala Phe Phe His Lys Ala Leu Ile Met Leu Tyr Met Asp His Ser Leu 65 7075 80 Val Gly Asp Asp Thr His Met Arg Glu Ile Ile Lys Glu Gly Met Leu 8590 95 Asp Ile Asn Leu Glu Asn Leu Asn Arg Lys Tyr Thr Asn Gln Val Ala100 105 110 Asp Ile Ser Glu Met Asp Glu Arg Val Leu Leu Ser Val Gln GlyAla 115 120 125 Ile Glu Thr Lys Gly Asp Ser Pro Lys Ser Pro Gln Leu AlaPhe Gln 130 135 140 Thr Ser Ser Ser Pro Ser His Arg Lys Leu Ser Thr HisAsp Leu Pro 145 150 155 160 Ala Ser Leu Pro Leu Ser Ile Ile Lys Ala PhePro Lys Lys Glu Asp 165 170 175 Ala Asp Lys Ile Val Asn Tyr Leu Asp GlnThr Leu Glu Glu Met Asn 180 185 190 Arg Thr Phe Thr Met Ala Val Lys AspPhe Leu Asp Ala Lys Leu Ser 195 200 205 Gly Lys Arg Phe Arg Gln Ala ArgGly Leu Tyr Tyr Lys Tyr Leu Gln 210 215 220 Lys Ile Leu Gly Pro Glu LeuVal Gln Lys Pro Gln Leu Lys Ile Gly 225 230 235 240 Gln Leu Met Lys GlnArg Lys Leu Thr Ala Ala Leu Leu Ala Cys Cys 245 250 255 Leu Glu Leu AlaLeu His Val His His Lys Leu Val Glu Gly Leu Arg 260 265 270 Phe Pro PheVal Leu His Cys Phe Ser Leu Asp Ala Tyr Asp Phe Gln 275 280 285 Lys 
IleLeu Glu Leu Val Val Arg Tyr Asp His Gly Phe Leu Gly Arg 290 295 300 GluLeu Ile Lys His Leu Asp Val Val Glu Glu Met Cys Leu Glu Ser 305 310 315320 Leu Ile Phe Arg Lys Ser Ser Gln Leu Trp Trp Glu Leu Asn Gln Arg 325330 335 Leu Pro Arg Tyr Lys Glu Val Asp Ala Glu Thr Glu Asp Lys Glu Asn340 345 350 Phe Ser Thr Gly Ser Ser Ile Cys Leu Arg Lys Phe Tyr Gly LeuAla 355 360 365 Asn Arg Arg Leu Leu Leu Leu Cys Lys Ser Leu Cys Leu ValAsp Ser 370 375 380 Phe Pro Gln Ile Trp His Leu Ala Glu His Ser Phe ThrLeu Glu Ser 385 390 395 400 Ser Arg Leu Leu Arg Asn Arg His Leu Asp GlnLeu Leu Leu Cys Ala 405 410 415 Ile His Leu His Val Arg Leu Glu Lys LeuHis Leu Thr Phe Ser Met 420 425 430 Ile Ile Gln His Tyr Arg Arg Gln ProHis Phe Arg Arg Ser Ala Tyr 435 440 445 Arg Glu Val Ser Leu Gly Asn GlyGln Thr Ala Asp Ile Ile Thr Phe 450 455 460 Tyr Asn Ser Val Tyr Val GlnSer Met Gly Asn Tyr Gly Arg His Leu 465 470 475 480 Glu Cys Ala Gln ThrArg Lys Ser Leu Glu Glu Ser Gln Ser Ser Val 485 490 495 Gly Ile Leu ThrGlu Asn Asn Phe Gln Arg Ile Glu His Glu Ser Gln 500 505 510 His Gln HisIle Phe Thr Ala Pro Ser Gln Gly Met Pro Lys Trp Leu 515 520 525 Leu LeuGln Ser Ser Thr Phe Ile Ser Arg Arg Ile Thr Thr Phe Leu 530 535 540 AlaLys Leu Ala Gln Arg Lys Ala Cys Cys Phe Glu 545 550 555 23 9 PRT AnyInsect 23 Arg Ile Cys Ser Cys Pro Lys Arg Asp 1 5 24 9 PRT Any Insect 24Lys Ile Cys Ser Cys Pro Lys Arg Asp 1 5 25 9 PRT Any Insect 25 Arg ValCys Ser Cys Pro Lys Arg Asp 1 5 26 9 PRT Any Insect 26 Lys Val Cys SerCys Pro Lys Arg Asp 1 5 27 9 PRT Any Insect 27 Arg Ile Cys Thr Cys ProLys Arg Asp 1 5 28 9 PRT Any Insect 28 Lys Ile Cys Thr Cys Pro Lys ArgAsp 1 5 29 9 PRT Any Insect 29 Arg Val Cys Thr Cys Pro Lys Arg Asp 1 530 9 PRT Any Insect 30 Lys Val Cys Thr Cys Pro Lys Arg Asp 1 5 31 7 PRTAny Insect misc_feature (2)..(2) “X” is any amino acid 31 Phe Xaa CysLys Asn Ser Cys 1 5 32 7 PRT Any Insect misc_feature (2)..(2) “X” is anyamino acid 32 Phe Xaa Cys Gln Asn Ser Cys 1 5 33 393 PRT Homo sapiens 33Met Glu Glu Pro 
Gln Ser Asp Pro Ser Val Glu Pro Pro Leu Ser Gln 1 5 1015 Glu Thr Phe Ser Asp Leu Trp Lys Leu Leu Pro Glu Asn Asn Val Leu 20 2530 Ser Pro Leu Pro Ser Gln Ala Met Asp Asp Leu Met Leu Ser Pro Asp 35 4045 Asp Ile Glu Gln Trp Phe Thr Glu Asp Pro Gly Pro Asp Glu Ala Pro 50 5560 Arg Met Pro Glu Ala Ala Pro Arg Val Ala Pro Ala Pro Ala Ala Pro 65 7075 80 Thr Pro Ala Ala Pro Ala Pro Ala Pro Ser Trp Pro Leu Ser Ser Ser 8590 95 Val Pro Ser Gln Lys Thr Tyr Gln Gly Ser Tyr Gly Phe Arg Leu Gly100 105 110 Phe Leu His Ser Gly Thr Ala Lys Ser Val Thr Cys Thr Tyr SerPro 115 120 125 Ala Leu Asn Lys Met Phe Cys Gln Leu Ala Lys Thr Cys ProVal Gln 130 135 140 Leu Trp Val Asp Ser Thr Pro Pro Pro Gly Thr Arg ValArg Ala Met 145 150 155 160 Ala Ile Tyr Lys Gln Ser Gln His Met Thr GluVal Val Arg Arg Cys 165 170 175 Pro His His Glu Arg Cys Ser Asp Ser AspGly Leu Ala Pro Pro Gln 180 185 190 His Leu Ile Arg Val Glu Gly Asn LeuArg Val Glu Tyr Leu Asp Asp 195 200 205 Arg Asn Thr Phe Arg His Ser ValVal Val Pro Tyr Glu Pro Pro Glu 210 215 220 Val Gly Ser Asp Cys Thr ThrIle His Tyr Asn Tyr Met Cys Asn Ser 225 230 235 240 Ser Cys Met Gly GlyMet Asn Arg Arg Pro Ile Leu Thr Ile Ile Thr 245 250 255 Leu Glu Asp SerSer Gly Asn Leu Leu Gly Arg Asn Ser Phe Glu Val 260 265 270 Arg Val CysAla Cys Pro Gly Arg Asp Arg Arg Thr Glu Glu Glu Asn 275 280 285 Leu ArgLys Lys Gly Glu Pro His His Glu Leu Pro Pro Gly Ser Thr 290 295 300 LysArg Ala Leu Pro Asn Asn Thr Ser Ser Ser Pro Gln Pro Lys Lys 305 310 315320 Lys Pro Leu Asp Gly Glu Tyr Phe Thr Leu Gln Ile Arg Gly Arg Glu 325330 335 Arg Phe Glu Met Phe Arg Glu Leu Asn Glu Ala Leu Glu Leu Lys Asp340 345 350 Ala Gln Ala Gly Lys Glu Pro Gly Gly Ser Arg Ala His Ser SerHis 355 360 365 Leu Lys Ser Lys Lys Gly Gln Ser Thr Ser Arg His Lys LysLeu Met 370 375 380 Phe Lys Thr Glu Gly Pro Asp Ser Asp 385 390 34 363PRT Xenopus laevis 34 Met Glu Pro Ser Ser Glu Thr Gly Met Asp Pro ProLeu Ser Gln Glu 1 5 10 15 Thr Phe Glu Asp Leu Trp Ser Leu Leu Pro AspPro Leu Gln Thr Val 20 25 30 
Thr Cys Arg Leu Asp Asn Leu Ser Glu Phe ProAsp Tyr Pro Leu Ala 35 40 45 Ala Asp Met Thr Val Leu Gln Glu Gly Leu MetGly Asn Ala Val Pro 50 55 60 Thr Val Thr Ser Cys Ala Val Pro Ser Thr AspAsp Tyr Ala Gly Lys 65 70 75 80 Tyr Gly Leu Gln Leu Asp Phe Gln Gln AsnGly Thr Ala Lys Ser Val 85 90 95 Thr Cys Thr Tyr Ser Pro Glu Leu Asn LysLeu Phe Cys Gln Leu Ala 100 105 110 Lys Thr Cys Pro Leu Leu Val Arg ValGlu Ser Pro Pro Pro Arg Gly 115 120 125 Ser Ile Leu Arg Ala Thr Ala ValTyr Lys Lys Ser Glu His Val Ala 130 135 140 Glu Val Val Lys Arg Cys ProHis His Glu Arg Ser Val Glu Pro Gly 145 150 155 160 Glu Asp Ala Ala ProPro Ser His Leu Met Arg Val Glu Gly Asn Leu 165 170 175 Gln Ala Tyr TyrMet Glu Asp Val Asn Ser Gly Arg His Ser Val Cys 180 185 190 Val Pro TyrGlu Gly Pro Gln Val Gly Thr Glu Cys Thr Thr Val Leu 195 200 205 Tyr AsnTyr Met Cys Asn Ser Ser Cys Met Gly Gly Met Asn Arg Arg 210 215 220 ProIle Leu Thr Ile Ile Thr Leu Glu Thr Pro Gln Gly Leu Leu Leu 225 230 235240 Gly Arg Arg Cys Phe Glu Val Arg Val Cys Ala Cys Pro Gly Arg Asp 245250 255 Arg Arg Thr Glu Glu Asp Asn Tyr Thr Lys Lys Arg Gly Leu Lys Pro260 265 270 Ser Gly Lys Arg Glu Leu Ala His Pro Pro Ser Ser Glu Pro ProLeu 275 280 285 Pro Lys Lys Arg Leu Val Val Val Asp Asp Asp Glu Glu IlePhe Thr 290 295 300 Leu Arg Ile Lys Gly Arg Ser Arg Tyr Glu Met Ile LysLys Leu Asn 305 310 315 320 Asp Ala Leu Glu Leu Gln Glu Ser Leu Asp GlnGln Lys Val Thr Ile 325 330 335 Lys Cys Arg Lys Cys Arg Asp Glu Ile LysPro Lys Lys Gly Lys Lys 340 345 350 Leu Leu Val Lys Asp Glu Gln Pro AspSer Glu 355 360 35 564 PRT Loligo forbesi 35 Met Ser Gln Gly Thr Ser ProAsn Ser Gln Glu Thr Phe Asn Leu Leu 1 5 10 15 Trp Asp Ser Leu Glu GlnVal Thr Ala Asn Glu Tyr Thr Gln Ile His 20 25 30 Glu Arg Gly Val Gly TyrGlu Tyr His Glu Ala Glu Pro Asp Gln Thr 35 40 45 Ser Leu Glu Ile Ser AlaTyr Arg Ile Ala Gln Pro Asp Pro Tyr Gly 50 55 60 Arg Ser Glu Ser Tyr AspLeu Leu Asn Pro Ile Ile Asn Gln Ile Pro 65 70 75 80 Ala Pro Met Pro IleAla Asp Thr Gln Asn Asn Pro Leu 
Val Asn His 85 90 95 Cys Pro Tyr Glu AspMet Pro Val Ser Ser Thr Pro Tyr Ser Pro His 100 105 110 Asp His Val GlnSer Pro Gln Pro Ser Val Pro Ser Asn Ile Lys Tyr 115 120 125 Pro Gly GluTyr Val Phe Glu Met Ser Phe Ala Gln Pro Ser Lys Glu 130 135 140 Thr LysSer Thr Thr Trp Thr Tyr Ser Glu Lys Leu Asp Lys Leu Tyr 145 150 155 160Val Arg Met Ala Thr Thr Cys Pro Val Arg Phe Lys Thr Ala Arg Pro 165 170175 Pro Pro Ser Gly Cys Gln Ile Arg Ala Met Pro Ile Tyr Met Lys Pro 180185 190 Glu His Val Gln Glu Val Val Lys Arg Cys Pro Asn His Ala Thr Ala195 200 205 Lys Glu His Asn Glu Lys His Pro Ala Pro Leu His Ile Val ArgCys 210 215 220 Glu His Lys Leu Ala Lys Tyr His Glu Asp Lys Tyr Ser GlyArg Gln 225 230 235 240 Ser Val Leu Ile Pro His Glu Met Pro Gln Ala GlySer Glu Trp Val 245 250 255 Val Asn Leu Tyr Gln Phe Met Cys Leu Gly SerCys Val Gly Gly Pro 260 265 270 Asn Arg Arg Pro Ile Gln Leu Val Phe ThrLeu Glu Lys Asp Asn Gln 275 280 285 Val Leu Gly Arg Arg Ala Val Glu ValArg Ile Cys Ala Cys Pro Gly 290 295 300 Arg Asp Arg Lys Ala Asp Glu LysAla Ser Leu Val Ser Lys Pro Pro 305 310 315 320 Ser Pro Lys Lys Asn GlyPhe Pro Gln Arg Ser Leu Val Leu Thr Asn 325 330 335 Asp Ile Thr Lys IleThr Pro Lys Lys Arg Lys Ile Asp Asp Glu Cys 340 345 350 Phe Thr Leu LysVal Arg Gly Arg Glu Asn Tyr Glu Ile Leu Cys Lys 355 360 365 Leu Arg AspIle Met Glu Leu Ala Ala Arg Ile Pro Glu Ala Glu Arg 370 375 380 Leu LeuTyr Lys Gln Glu Arg Gln Ala Pro Ile Gly Arg Leu Thr Ser 385 390 395 400Leu Pro Ser Ser Ser Ser Asn Gly Ser Gln Asp Gly Ser Arg Ser Ser 405 410415 Thr Ala Phe Ser Thr Ser Asp Ser Ser Gln Val Asn Ser Ser Gln Asn 420425 430 Asn Thr Gln Met Val Asn Gly Gln Val Pro His Glu Glu Glu Thr Pro435 440 445 Val Thr Lys Cys Glu Pro Thr Glu Asn Thr Ile Ala Gln Trp LeuThr 450 455 460 Lys Leu Gly Leu Gln Ala Tyr Ile Asp Asn Phe Gln Gln LysGly Leu 465 470 475 480 His Asn Met Phe Gln Leu Asp Glu Phe Thr Leu GluAsp Leu Gln Ser 485 490 495 Met Arg Ile Gly Thr Gly His Arg Asn Lys IleTrp Lys Ser Leu Leu 500 505 510 Asp Tyr 
Arg Arg Leu Leu Ser Ser Gly ThrGlu Ser Gln Ala Leu Gln 515 520 525 His Ala Ala Ser Asn Ala Ser Thr LeuSer Val Gly Ser Gln Asn Ser 530 535 540 Tyr Cys Pro Gly Phe Tyr Glu ValThr Arg Tyr Thr Tyr Lys His Thr 545 550 555 560 Ile Ser Tyr Leu

What is claimed is:
 1. An isolated nucleic acid molecule comprising a nucleic acid sequence selected from the group consisting of: (a) a nucleic acid sequence that encodes a polypeptide comprising at least 7 contiguous amino acids of any one of SEQ ID NOs 4, 6, 8, and 10; (b) a nucleic acid sequence that encodes a polypeptide comprising at least 7 contiguous amino acids of SEQ ID NO:2, wherein the isolated nucleic acid molecule is less than 15 kb in size; (c) a nucleic acid sequence that encodes a polypeptide comprising at least 9 contiguous amino acids that share 100% sequence similarity with 9 contiguous amino acids of any one of SEQ ID NOs 4, 6, 8, and 10; (d) a nucleic acid sequence that encodes a polypeptide comprising at least 9 contiguous amino acids that share 100% sequence similarity with 9 contiguous amino acids of SEQ ID NO 2; wherein the isolated nucleic acid molecule is less than 15 kb in size; (e) at least 20 contiguous nucleotides of any of nucleotides 1-111 of SEQ ID NO:1, 1-120 of SEQ ID NO:3, 1-93 of SEQ ID NO:5, and 1-1225 of SEQ ID NO:18; (f) a nucleic acid sequence that encodes a polypeptide comprising an amino acid sequence having at least 80% sequence similarity with a sequence selected from the group consisting of SEQ ID NO:20 and SEQ ID NO:22; and (g) the complement of the nucleic acid of any of (a)-(f).
 2. The isolated nucleic acid molecule of claim 1 that is RNA.
 3. The isolated nucleic acid molecule of claim 1 wherein the nucleic acid sequence has at least 50% sequence identity with a sequence selected from the group consisting of any of SEQ ID NOs:1, 3, 5, 7, 9, 18, 19 and 21.
 4. The isolated nucleic acid molecule of claim 1 wherein the nucleic acid sequence encodes a polypeptide comprising an amino acid sequence selected from the group consisting of: RICSCPKRD, KICSCPKRD, RVCSCPKRD, KVCSCPKRD, RICTCPKRD, KICTCPKRD, RVCTCPKRD, KVCTCPKRD, FXCKNSC and FXCQNSC, wherein X is any amino acid.
 5. The isolated nucleic acid molecule of claim 1 wherein the nucleic acid sequence encodes at least 17 contiguous amino acids of any of SEQ ID NOs 2, 4, 6, 8, and 10.
 6. The isolated nucleic acid molecule of claim 1 wherein the nucleic acid sequence encodes a polypeptide comprising at least 19 amino acids that share 100% sequence similarity with 19 amino acids of any of SEQ ID NOs 2, 4, 6, 8, and 10.
 7. The isolated nucleic acid molecule of claim 1 wherein the nucleic acid sequence encodes a polypeptide having at least 50% sequence identity with any of SEQ ID NOs 2, 4, 6, 8, and 10.
 8. The isolated nucleic acid molecule of claim 1 wherein the nucleic acid sequence encodes at least one p53 domain selected from the group consisting of an activation domain, a DNA binding domain, a linker domain, an oligomerization domain, and a basic regulatory domain.
 9. The isolated nucleic acid molecule of claim 1 wherein the nucleic acid sequence encodes a constitutively active p53.
 10. The isolated nucleic acid molecule of claim 1 wherein the nucleic acid sequence encodes a dominant negative p53.
 11. A vector comprising the nucleic acid molecule of claim 1.
 12. A host cell comprising the vector of claim 11.
 13. A process for producing a p53 polypeptide comprising culturing the host cell of claim 8 under conditions suitable for expression of the p53 polypeptide and recovering the polypeptide.
 14. A purified polypeptide comprising an amino acid sequence selected from the group consisting of: a) at least 7 contiguous amino acids of any one of SEQ ID NOs 2, 4, 6, 8, and 10; b) at least 9 contiguous amino acids that share 100% sequence similarity with at least 9 contiguous amino acids of any one of SEQ ID NOs 2, 4, 6, 8, and 10; and c) at least 10 contiguous amino acids of a sequence selected from the group consisting of SEQ ID NO:20 and SEQ ID NO:22.
 15. The purified polypeptide of claim 14 wherein the amino acid sequence is selected from the group consisting of RICSCPKRD, KICSCPKRD, RVCSCPKRD, KVCSCPKRD, RICTCPKRD, KICTCPKRD, RVCTCPKRD, KVCTCPKRD, FXCKNSC and FXCQNSC, wherein X is any amino acid.
 16. The purified polypeptide of claim 14 wherein the amino acid sequence has at least 50% sequence similarity with a sequence selected from the group consisting of SEQ ID NOs 2, 4, 6, 8, and 10.
 17. A method for detecting a candidate compound or molecule that modulates p53 activity said method comprising contacting a p53 polypeptide, or a nucleic acid encoding the p53 polypeptide, with one or more candidate compounds or molecules, and detecting any interaction between the candidate compound or molecule and the p53 polypeptide or nucleic acid; wherein the p53 polypeptide comprises an amino acid sequence selected from the group consisting of: a) at least 7 contiguous amino acids of any one of SEQ ID NOs 2, 4, 6, 8, and 10; and b) at least 9 contiguous amino acids that share 100% sequence similarity with at least 9 contiguous amino acids of any one of SEQ ID NOs 2, 4, 6, 8, and 10.
 18. The method of claim 17 wherein the candidate compound or molecule is a putative pharmaceutical agent.
 19. The method of claim 17 wherein the contacting comprises administering the candidate compound or molecule to cultured host cells that have been genetically engineered to express the p53 protein.
 20. The method of claim 17 wherein the contacting comprises administering the candidate compound or molecule to an insect that has been genetically engineered to express the p53 protein.
 21. The method of claim 20 wherein the candidate compound is a putative pesticide.
 22. A first insect that has been genetically modified to express or mis-express a p53 protein, or the progeny of the insect that has inherited the p53 protein expression or mis-expression, wherein the p53 protein comprises an amino acid sequence selected from the group consisting of: a) at least 7 contiguous amino acids of any one of SEQ ID NOs 2, 4, 6, 8, and 10; and b) at least 9 contiguous amino acids that share 100% sequence similarity with at least 9 contiguous amino acids of any one of SEQ ID NOs 2, 4, 6, 8, and 10.
 23. The insect of claim 22 wherein said insect is Drosophila that has been genetically modified to express a dominant negative p53 having a mutation selected from the group consisting of R155H, H159N, and R266T.
 24. A method for studying p53 activity comprising detecting the phenotype caused by the expression or mis-expression of the p53 protein in the first insect of claim 22.
 25. The method of claim 24 additionally comprising observing a second insect having the same genetic modification as the first insect which causes the expression or mis-expression of the p53 protein, and wherein the second animal additionally comprises a mutation in a gene of interest, wherein differences, if any, between the phenotype of the first animal and the phenotype of the second animal identify the gene of interest as capable of modifying the function of the gene encoding the p53 protein.
 26. The method of claim 24 additionally comprising administering one or more candidate compounds or molecules to the insect or its progeny and observing any changes in p53 activity of the insect or its progeny.
 27. A method of modulating p53 activity comprising contacting an insect cell with the isolated nucleic acid molecule of claim 1, wherein the isolated nucleic acid molecule is dsRNA derived from a coding region of a nucleic acid sequence selected from the group consisting of SEQ ID NO:1, 3, 5, 7, and 9.
 28. The method of claim 27 wherein cultured insect cells are contacted with the dsRNA and apoptosis of the cultured cells is assayed.
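
For illustration only, the peptide motifs recited in claims 4 and 15 and the "contiguous amino acids" criterion of claims 1 and 14 can be expressed as a short screening sketch. The following Python example is not part of the claimed subject matter and is a minimal sketch under stated assumptions: sequences are given in one-letter code, "X" is taken to mean any of the 20 standard amino acids, and the contiguous-residue check uses exact identity rather than sequence similarity. The function names `find_motifs` and `shares_contiguous` are hypothetical.

```python
import re

# The eight 9-mers of claims 4 and 15 collapse to one pattern:
# (R or K)(I or V)C(S or T)CPKRD.
MOTIF_9MER = re.compile(r"[RK][IV]C[ST]CPKRD")

# FXCKNSC and FXCQNSC; X is assumed to be any of the 20 standard residues.
MOTIF_7MER = re.compile(r"F[ACDEFGHIKLMNPQRSTVWY]C[KQ]NSC")


def find_motifs(protein):
    """Return (motif_name, start_index, matched_text) for every motif hit."""
    hits = []
    for name, pattern in (("9mer", MOTIF_9MER), ("7mer", MOTIF_7MER)):
        for match in pattern.finditer(protein):
            hits.append((name, match.start(), match.group()))
    return hits


def shares_contiguous(query, reference, k=7):
    """True if query contains at least k contiguous residues that occur,
    identically and in the same order, somewhere in reference.
    Assumption: exact identity only; similarity scoring is not modeled."""
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(query[i:i + k] in ref_kmers
               for i in range(len(query) - k + 1))


if __name__ == "__main__":
    # Hypothetical test peptide carrying one of the recited 9-mers.
    print(find_motifs("AAARICSCPKRDAAA"))
    # The corresponding human p53 segment (from SEQ ID NO:33) is RVCACPGRD.
    print(find_motifs("RVCACPGRD"))
```

Under these assumptions the human segment RVCACPGRD is not reported as a hit, because it differs from the recited insect motifs at two positions.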